(57) Abstract

The invention relates to the transcriptional regulatory sequence (TRS) of carcinoembryonic antigen (CEA) and molecular chimaera
comprising the CEA TRS and DNA encoding a heterologous enzyme. CEA TRS is capable of targeting expression of the heterologous
enzyme to CEA+ cells and the heterologous enzyme is preferably an enzyme capable of catalysing the production of an agent cytotoxic or
cytostatic to CEA+ cells. For example the enzyme may be cytosine deaminase which is capable of catalysing formation of the cytotoxic
compound 5-fluorouracil from the non-toxic compound 5-fluorocytosine.
TRANSCRIPTIONAL REGULATORY SEQUENCE OF CARCINOEMBRYONIC ANTIGEN FOR
EXPRESSION TARGETING

The present invention relates to a transcriptional regulatory sequence useful in
gene therapy.

Colorectal carcinoma (CRC) is the second most frequent cancer and the second
leading cause of cancer-associated deaths in the United States and Western Europe. The
overall five-year survival rate for patients has not meaningfully improved in the last three
decades. Prognosis for the CRC cancer patient is associated with the depth of tumor
penetration into the bowel wall, the presence of regional lymph node involvement and,
most importantly, the presence of distant metastases. The liver is the most common site
for distant metastasis and, in approximately 30% of patients, the sole initial site of tumor
recurrence after successful resection of the primary colon cancer. Hepatic metastases are
the most common cause of death in the CRC cancer patient.

The treatment of choice for the majority of patients with hepatic CRC metastasis
is systemic or regional chemotherapy using 5-fluorouracil (5-FU) alone or in combination
with other agents such as levamisole. However, despite extensive effort, there is still no
satisfactory treatment for hepatic CRC metastasis. Systemic single- and combination-agent
chemotherapy and radiation are relatively ineffective, emphasizing the need for new
approaches and therapies for the treatment of the disease.

A gene therapy approach is being developed for primary and metastatic liver
tumors that exploits the transcriptional differences between normal and metastatic cells.
This approach involves linking the transcriptional regulatory sequences (TRS) of a tumor
associated marker gene to the coding sequence of an enzyme, typically a non-mammalian
enzyme, to create an artificial chimaeric gene that is selectively expressed in cancer
cells. The enzyme should be capable of converting a non-toxic prodrug into a
cytotoxic or cytostatic drug thereby allowing for selective
elimination of metastatic cells.

WO 95/14100    PCT/GB94/02546

The principle of this approach has been demonstrated
using an alpha-fetoprotein/Varicella Zoster virus thymidine
kinase chimaera to target hepatocellular carcinoma with the
enzyme metabolically activating the non-toxic prodrug
6-methoxypurine arabinonucleoside ultimately leading to
formation of the cytotoxic anabolite adenine
arabinonucleoside triphosphate (see Huber et al., Proc.
Natl. Acad. Sci. U.S.A., 88, 8039-8043 (1991) and
EP-A-0 415 731).



For the treatment of hepatic metastases of CRC, it is
desirable to control the expression of an enzyme with the
transcriptional regulatory sequences of a tumor marker
associated with such metastases.

CEA is a tumor associated marker that is regulated at
the transcriptional level and is expressed by most CRC
tumors but is not expressed in normal liver. CEA is widely
used as an important diagnostic tool for postoperative
surveillance, chemotherapy efficacy determinations,
immunolocalisation and immunotherapy. The TRS of CEA are
potentially of value in the selective expression of an
enzyme in CEA+ tumor cells since there appears to be a very
low heterogeneity of CEA within metastatic tumors, perhaps
because CEA may have an important functional role in
metastasis.






The cloning of the CEA gene has been reported and the
promoter localised to a region of 424 nucleotides upstream
from the translational start (Schrewe et al., Mol. Cell.
Biol., 10, 2738-2748 (1990)) but the full TRS was not
identified.

In the work on which the present invention is based, CEA genomic clones have
been identified and isolated from the human chromosome 19 genomic library LL19NL01,
ATCC number 57766, by standard techniques described hereinafter. The cloned CEA
sequences comprise CEA enhancers in addition to the CEA promoter. The CEA
enhancers are especially advantageous for high level expression in CEA-positive cells and
no expression in CEA-negative cells.

According to one aspect, the present invention provides a DNA molecule
comprising the CEA TRS but without associated CEA coding sequence.

According to another aspect, the present invention provides use of a CEA TRS
for targeting expression of a heterologous enzyme to CEA+ cells. Preferably the
enzyme is capable of catalysing the production of an agent cytotoxic or cytostatic to the
CEA+ target cells.

As described in more detail hereinafter, the present inventors have sequenced a
large part of the CEA gene upstream of the coding sequence. As used herein, the term
"CEA TRS" means any part of the CEA gene upstream of the coding sequence which
has a transcriptional regulatory effect on a heterologous coding sequence operably linked
thereto.

Certain parts of the sequence of the CEA gene upstream of the coding sequence 
have been identified as making significant contributions to the transcriptional regulatory 
effect, more particularly increasing the level and/or selectivity of transcription.
Preferably the CEA TRS includes all or part of the region from about -299b to about
+69b, more preferably about -90b to about +69b. Increases in the level of transcription
and/or selectivity can also be obtained by including one or more of the following regions:
-14.5kb to -10.6kb, preferably -13.6kb to -10.6kb, and/or -6.1kb to -3.8kb. All of the
regions referred to above can be included in either orientation and in different
combinations. In addition, repeats of these regions may be included, particularly repeats
of the -90b to +69b region, containing for example 2, 3, 4 or more copies of the region.
The base numbering refers to the sequence of Figure 6. The regions referred to are
included in the plasmids described in Figure 5B.
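
As an illustration only, the coordinate ranges quoted above can be written down as a
small lookup table from which a candidate TRS design (which regions, in which
orientation, in how many copies) can be described. The sketch below is not part of the
disclosure: the region labels, the Element class and the describe helper are hypothetical
names chosen for this example, and the positions are simply the figures quoted in the
text, expressed in bases relative to the Figure 6 numbering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A stretch of CEA upstream sequence, in bases (Figure 6 numbering)."""
    start: int
    end: int


# Coordinates quoted in the text; the dictionary keys are hypothetical labels.
REGIONS = {
    "promoter": Region(-299, 69),
    "core_promoter": Region(-90, 69),
    "distal_enhancer": Region(-14500, -10600),
    "distal_enhancer_preferred": Region(-13600, -10600),
    "proximal_enhancer": Region(-6100, -3800),
}


@dataclass(frozen=True)
class Element:
    """One building block of a TRS design (hypothetical helper type)."""
    region: str
    copies: int = 1
    reversed_orientation: bool = False


def describe(design):
    """Return a one-line summary of a design; raises KeyError for unknown regions."""
    parts = []
    for element in design:
        region = REGIONS[element.region]
        arrow = "<-" if element.reversed_orientation else "->"
        parts.append(f"{element.copies}x[{region.start}..{region.end}]{arrow}")
    return " + ".join(parts)


# Example design: one reversed enhancer region plus two copies of the -90b to +69b region.
design = [
    Element("distal_enhancer_preferred", reversed_orientation=True),
    Element("core_promoter", copies=2),
]
print(describe(design))
```

The design list mirrors the statement that regions may be combined in either orientation
and repeated; it says nothing about which combinations perform best.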

Gene therapy involves the stable integration of new genes into target cells and the
expression of those genes, once they are in place, to alter the phenotype of that particular
target cell (for review see Anderson, W.F. Science 226, 401-409 (1984) and
McCormick, D. Biotechnology 3, 689-693 (1985)). Gene therapy may be beneficial for
the treatment of genetic diseases that involve the replacement of one defective or missing
enzyme, such as hypoxanthine-guanine phosphoribosyl transferase in Lesch-Nyhan
disease, purine nucleoside phosphorylase in severe immunodeficiency disease, and
adenosine deaminase in severe combined immunodeficiency disease.



It has now been found that it is possible to selectively arrest the growth of, or
kill, mammalian carcinoma cells with prodrugs, i.e. chemical agents capable
of selective conversion to cytotoxic (causing cell death)
or cytostatic (suppressing cell multiplication and growth)
metabolites. This is achieved by the construction of a
molecular chimaera comprising a "target tissue-specific"
TRS that is selectively activated in target cells, such as
cancerous cells, and that controls the expression of a
heterologous enzyme. This molecular chimaera may be
manipulated via suitable vectors and incorporated into an
infective virion. Upon administration of an infective
virion containing the molecular chimaera to a host (e.g.,
mammal or human), the enzyme is selectively expressed in
the target cells. Administration of prodrugs (compounds
that are selectively metabolised by the enzyme into
metabolites that are either further metabolised to or are,
in fact, cytotoxic or cytostatic agents) can then result in
the production of the cytotoxic or cytostatic agent in situ
in the cancer cell. According to the present invention CEA
TRS provides the target tissue specificity.

Molecular chimaeras (recombinant molecules comprised 
of unnatural combinations of genes or sections of genes), 
and infective virions (complete viral particles capable of 
infecting appropriate host cells) are well known in the art 
of molecular biology. 



A number of enzyme prodrug combinations may be used
for the above purpose, providing the enzyme is capable of
selectively activating the administered compound either
directly or through an intermediate to a cytostatic or
cytotoxic metabolite. The choice of compound will also
depend on the enzyme system used, but must be selectively
metabolised by the enzyme either directly or indirectly to
a cytotoxic or cytostatic metabolite. The term
heterologous enzyme, as used herein, refers to an enzyme
that is derived from or associated with a species which is
different from the host to be treated and which will
display the appropriate characteristics of the above
mentioned selectivity. In addition, it will also be
appreciated that a heterologous enzyme may also refer to an
enzyme that is derived from the host to be treated that has
been modified to have unique characteristics unnatural to
the host.

The enzyme cytosine deaminase (CD) catalyses the
deamination of cytosine to uracil. Cytosine deaminase is
present in microbes and fungi but absent in higher
eukaryotes. This enzyme catalyses the hydrolytic
deamination of cytosine and 5-fluorocytosine (5-FC) to
uracil and 5-fluorouracil (5-FU), respectively. Since
mammalian cells do not express significant amounts of
cytosine deaminase, they are incapable of converting 5-FC
to the toxic metabolite 5-FU and therefore 5-fluorocytosine
is nontoxic to mammalian cells at concentrations which are
effective for antimicrobial activity. 5-Fluorouracil is
highly toxic to mammalian cells and is widely used as an
anticancer agent.






In mammalian cells, some genes are ubiquitously
expressed. Most genes, however, are expressed in a
temporal and/or tissue-specific manner, or are activated in
response to extracellular inducers. For example, certain
genes are actively transcribed only at very precise times
in ontogeny in specific cell types, or in response to some
inducing stimulus. This regulation is mediated in part by
the interaction between transcriptional regulatory
sequences (for example, promoter and enhancer regulatory
DNA sequences), and sequence-specific, DNA-binding
transcriptional protein factors.

It has now been found that it is possible to alter
certain mammalian cells, e.g. colorectal carcinoma cells,
metastatic colorectal carcinoma cells and hepatic
colorectal carcinoma cells to selectively express a
heterologous enzyme as hereinbefore defined, e.g. CD. This
is achieved by the construction of molecular chimaeras in
an expression cassette.

Expression cassettes themselves are well known in the
art of molecular biology. Such an expression cassette
contains all essential DNA sequences required for
expression of the heterologous enzyme in a mammalian cell.
For example, a preferred expression cassette will contain
a molecular chimaera containing the coding sequence for CD,
an appropriate polyadenylation signal for a mammalian gene
(i.e., a polyadenylation signal that will function in a
mammalian cell), and CEA enhancers and promoter sequences
in the correct orientation.
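
The cassette layout just described can be sketched as an ordered list of parts with a
simple check that the required components are present and that the promoter, coding
sequence and polyadenylation signal appear in 5'-to-3' order. This is a minimal sketch
and not a construct specification: the part names and the is_valid_cassette helper are
hypothetical, and the enhancer is deliberately allowed at any position, which is an
assumption made for this example.

```python
# Hypothetical part names for the components listed in the text.
REQUIRED_PARTS = (
    "CEA_enhancer",
    "CEA_promoter",
    "CD_coding_sequence",
    "polyA_signal",
)


def is_valid_cassette(parts):
    """Check a 5'-to-3' list of part names.

    All required parts must be present, and the promoter must precede the
    coding sequence, which must precede the polyadenylation signal. The
    enhancer may sit anywhere (an assumption of this sketch).
    """
    if any(name not in parts for name in REQUIRED_PARTS):
        return False
    promoter = parts.index("CEA_promoter")
    coding = parts.index("CD_coding_sequence")
    poly_a = parts.index("polyA_signal")
    return promoter < coding < poly_a


cassette = ["CEA_enhancer", "CEA_promoter", "CD_coding_sequence", "polyA_signal"]
print(is_valid_cassette(cassette))
```

A list that omits the polyadenylation signal, or places it upstream of the coding
sequence, is rejected by the same check.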



Normally, two DNA sequences are required for the
complete and efficient transcriptional regulation of genes
that encode messenger RNAs in mammalian cells: promoters
and enhancers. Promoters are located immediately upstream
(5') from the start site of transcription. Promoter
sequences are required for accurate and efficient
initiation of transcription. Different gene-specific
promoters reveal a common pattern of organisation. A
typical promoter includes an AT-rich region called a TATA
box (which is located approximately 30 base pairs 5' to the
transcription initiation start site) and one or more
upstream promoter elements (UPEs). The UPEs are a
principal target for the interaction with sequence-specific
nuclear transcriptional factors. The activity of promoter
sequences is modulated by other sequences called enhancers.
The enhancer sequence may be a great distance from the
promoter in either an upstream (5') or downstream (3')
position. Hence, enhancers operate in an orientation- and
position-independent manner. However, based on similar
structural organisation and function that may be
interchanged, the absolute distinction between promoters
and enhancers is somewhat arbitrary. Enhancers increase
the rate of transcription from the promoter sequence. It
is predominantly the interaction between sequence-specific
transcriptional factors with the UPE and enhancer sequences
that enables mammalian cells to achieve tissue-specific gene
expression. The presence of these transcriptional protein
factors (tissue-specific, trans-activating factors) bound
to the UPE and enhancers (cis-acting, regulatory sequences)
enables other components of the transcriptional machinery,
including RNA polymerase, to initiate transcription with
tissue-specific selectivity and accuracy.






The transcriptional regulatory sequence for CEA is
suitable for targeting expression in colorectal carcinoma,
metastatic colorectal carcinoma, and hepatic colorectal
metastases, transformed cells of the gastrointestinal
tract, lung, breast and other tissues. By placing the
expression of the gene encoding CD under the
transcriptional control of the CRC-associated marker gene,
CEA, the nontoxic compound, 5-FC, can be metabolically
activated to 5-FU selectively in CRC cells (for example,
hepatic CRC cells). An advantage of this system is that
the generated toxic compound, 5-fluorouracil, can diffuse
out of the cell in which it was generated and kill adjacent
tumor cells which did not incorporate the artificial gene
for cytosine deaminase.






In the work on which the present invention is based,
CEA genomic clones were identified and isolated from the
human chromosome 19 genomic library LL19NL01, ATCC number
57766, by standard techniques described hereinafter. The
cloned CEA sequences comprise CEA enhancers in
addition to the CEA promoter. The CEA enhancers are
especially advantageous for high level expression in
CEA-positive cells and no expression in CEA-negative cells.

The present invention further provides a molecular
chimaera comprising a CEA TRS and a DNA sequence
operatively linked thereto encoding a heterologous enzyme,
preferably an enzyme capable of catalysing the production
of an agent cytotoxic or cytostatic to the CEA+ cells.

The present invention further provides a molecular
chimaera comprising a DNA sequence containing the coding
sequence of the gene that codes for a heterologous enzyme
under the control of a CEA TRS in an expression cassette.

The present invention further provides in a preferred
embodiment a molecular chimaera comprising a CEA TRS which
is operatively linked to the coding sequence for the gene
encoding a non-mammalian cytosine deaminase (CD). The
molecular chimaera comprises a promoter and additionally
comprises an enhancer.

In particular, the present invention provides a
molecular chimaera comprising a DNA sequence of the coding
sequence of the gene coding for the heterologous enzyme,
which is preferably CD, additionally including an
appropriate polyadenylation sequence, which is linked
downstream in a 3' position and in the proper orientation
to a CEA TRS. Most preferably the expression cassette also
contains an enhancer sequence.

Preferably non-mammalian CD is selected from the group 
consisting of bacterial, fungal, and yeast cytosine 
deaminase. 



The molecular chimaera of the present invention may be 
made utilizing standard recombinant DNA techniques. 

Another aspect of the invention is the genomic CEA
sequence as described by Seq ID 1.

The coding sequence of CD and a polyadenylation signal
(for example see Seq IDs 1 and 2) are placed in the
proper 3' orientation to the essential CEA transcriptional
regulatory elements. This molecular chimaera enables the
selective expression of CD in cells or tissue that normally
express CEA. Expression of the CD gene in mammalian CRC
and metastatic CRC (hepatic colorectal carcinoma
metastases) will enable nontoxic 5-FC to be selectively
metabolised to cytotoxic 5-FU.

Accordingly, in another aspect of the present
invention, there is provided a method of constructing a
molecular chimaera comprising linking a DNA sequence
encoding a heterologous enzyme gene, e.g. CD, to a CEA TRS.

In particular the present invention provides a method
of constructing a molecular chimaera as herein defined, the
method comprising ligating a DNA sequence encoding the
coding sequence and polyadenylation signal of the gene for
a heterologous enzyme (e.g. non-mammalian CD) to a CEA TRS
(e.g., promoter sequence and enhancer sequence).


These molecular chimaeras can be delivered to the
target tissue or cells by a delivery system. For
administration to a host (e.g., mammal or human), it is
necessary to provide an efficient in vivo delivery system
that stably incorporates the molecular chimaera into the
cells. Known methods utilize techniques of calcium
phosphate transfection, electroporation, microinjection,
liposomal transfer, ballistic barrage, DNA viral infection
or retroviral infection. For a review of this subject see
Biotechniques 6, No. 7, (1988).






The technique of retroviral infection of cells to
integrate artificial genes employs retroviral shuttle
vectors which are known in the art (Miller A.D., Buttimore
C. Mol. Cell. Biol. 6, 2895-2902 (1986)). Essentially,
retroviral shuttle vectors (retroviruses comprising
molecular chimaeras used to deliver and stably integrate
the molecular chimaera into the genome of the target cell)
are generated using the DNA form of the retrovirus
contained in a plasmid. These plasmids also contain
sequences necessary for selection and growth in bacteria.
Retroviral shuttle vectors are constructed using standard
molecular biology techniques well known in the art.
Retroviral shuttle vectors have the parental endogenous
retroviral genes (e.g., gag, pol and env) removed from the
vectors and the DNA sequence of interest is inserted, such
as the molecular chimaeras that have been described. The
vectors also contain appropriate retroviral regulatory
sequences for viral encapsidation, proviral insertion into
the target genome, message splicing, termination and
polyadenylation. Retroviral shuttle vectors have been
derived from the Moloney murine leukaemia virus (Mo-MLV)
but it will be appreciated that other retroviruses can be
used such as the closely related Moloney murine sarcoma
virus. Other DNA viruses may also prove to be useful as
delivery systems. The bovine papilloma virus (BPV)
replicates extrachromosomally, so that delivery systems
based on BPV have the advantage that the delivered gene is
maintained in a nonintegrated manner.






Thus according to a further aspect of the present
invention there is provided a retroviral shuttle vector
comprising the molecular chimaeras as hereinbefore
defined.






The advantages of a retroviral-mediated gene transfer
system are the high efficiency of the gene delivery to the
targeted tissue or cells, sequence specific integration
regarding the viral genome (at the 5' and 3' long terminal
repeat (LTR) sequences) and little rearrangement of
delivered DNA compared to other DNA delivery systems.

Accordingly in a preferred embodiment of the present
invention there is provided a retroviral shuttle vector
comprising a DNA sequence comprising a 5' viral LTR
sequence, a cis-acting psi-encapsidation sequence, a
molecular chimaera as hereinbefore defined and a 3' viral
LTR sequence.

In a preferred embodiment, and to help eliminate
non-tissue-specific expression of the molecular chimaera, the
molecular chimaera is placed in opposite transcriptional
orientation to the 5' retroviral LTR. In addition, a
dominant selectable marker gene may also be included that
is transcriptionally driven from the 5' LTR sequence. Such
a dominant selectable marker gene may be the bacterial
neomycin-resistance gene NEO (aminoglycoside 3'
phosphotransferase type II), which confers on eukaryotic cells
resistance to the neomycin analogue Geneticin (antibiotic
G418 sulphate; registered trademark of GIBCO). The NEO
gene aids in the selection of packaging cells that contain
these sequences.
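
The vector arrangement described in the two preceding paragraphs can be sketched as an
annotated list of features, with a check that the chimaera lies in the opposite
transcriptional orientation to the 5' LTR. This is only an illustrative data structure:
the feature names, the strand notation, the driven_by field and the relative order of
the NEO gene and the chimaera between the LTRs are assumptions made for this example
and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotated element of a shuttle vector (hypothetical helper type)."""
    name: str
    strand: str          # "+" = same direction as the 5' LTR, "-" = opposite
    driven_by: str = ""  # element assumed to drive transcription, if any


# Hypothetical layout; the order of NEO and the chimaera is an assumption.
VECTOR = [
    Feature("5_LTR", "+"),
    Feature("psi", "+"),
    Feature("NEO", "+", driven_by="5_LTR"),
    Feature("CEA_TRS_CD_chimaera", "-", driven_by="CEA_TRS"),
    Feature("3_LTR", "+"),
]


def chimaera_is_antisense(vector):
    """True if the chimaera is oriented opposite to the 5' LTR."""
    ltr = next(f for f in vector if f.name == "5_LTR")
    chimaera = next(f for f in vector if f.name == "CEA_TRS_CD_chimaera")
    return chimaera.strand != ltr.strand


print(chimaera_is_antisense(VECTOR))
```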






The retroviral vector is preferably based on the
Moloney murine leukaemia virus but it will be appreciated
that other vectors may be used. Vectors containing a NEO
gene as a selectable marker have been described, for
example, the N2 vector (Eglitis M.A., Kantoff P., Gilboa
E., Anderson W.F. Science 230, 1395-1398 (1985)).

A theoretical problem associated with retroviral
shuttle vectors is the potential of retroviral long
terminal repeat (LTR) regulatory sequences
transcriptionally activating a cellular oncogene at the
site of integration in the host genome. This problem may
be diminished by creating SIN vectors. SIN vectors are
self-inactivating vectors that contain a deletion
comprising the promoter and enhancer regions in the
retroviral LTR. The LTR sequences of SIN vectors do not
transcriptionally activate 5' or 3' genomic sequences. The
transcriptional inactivation of the viral LTR sequences
diminishes insertional activation of adjacent target cell
DNA sequences and also aids in the selected expression of
the delivered molecular chimaera. SIN vectors are created
by removal of approximately 299 bp in the 3' viral LTR
sequence (Gilboa E., Eglitis M.A., Kantoff P.W., Anderson
W.F. Biotechniques 4, 504-512 (1986)).

Thus preferably the retroviral shuttle vectors of the 
present invention are SIN vectors. 



Since the parental retroviral gag, pol, and env genes
have been removed from these shuttle vectors, a helper
virus system may be utilised to provide the gag, pol, and
env retroviral gene products in trans to package or
encapsidate the retroviral vector into an infective virion.
This is accomplished by utilising specialised "packaging"
cell lines, which are capable of generating infectious,
synthetic virus yet are deficient in the ability to produce
any detectable wild-type virus. In this way the artificial
synthetic virus contains a chimaera of the present
invention packaged into synthetic artificial infectious
virions free of wild-type helper virus. This is based on
the fact that the helper virus that is stably integrated
into the packaging cell contains the viral structural
genes, but is lacking the psi-site, a cis-acting regulatory
sequence which must be contained in the viral genomic RNA
molecule for it to be encapsidated into an infectious viral
particle.



Accordingly, in a still further aspect of the present 
invention, there is provided an infective virion comprising 
a retroviral shuttle vector, as hereinbefore described, 
said vector being encapsidated within viral proteins to 
create an artificial, infective, replication-defective, 
retrovirus. 

In another aspect of the present invention there is
provided a method for producing infective virions of the
present invention by delivering the artificial retroviral
shuttle vector comprising a molecular chimaera of the
invention, as hereinbefore described, into a packaging cell
line.



The packaging cell line may have stably integrated
within it a helper virus lacking a psi-site and other
regulatory sequence, as hereinbefore described, or,
alternatively, the packaging cell line may be engineered so
as to contain helper virus structural genes within its
genome. In addition to removal of the psi-site, additional
alterations can be made to the helper virus LTR regulatory
sequences to ensure that the helper virus is not packaged
in virions and is blocked at the level of reverse
transcription and viral integration. Alternatively, helper
virus structural genes (i.e., gag, pol and env) may be
individually and independently transferred into the
packaging cell line. Since these viral structural genes
are separated within the packaging cell's genome, there is
little chance of covert recombinations generating wild-type
virus.



The present invention also provides a packaging cell
line comprising an infective virion, as described
hereinbefore, said virion further comprising a retroviral
shuttle vector.

The present invention further provides for a packaging
cell line comprising a retroviral shuttle vector as
described hereinbefore.

In addition to retroviral-mediated gene delivery of
the chimeric, artificial, therapeutic gene, other gene
delivery systems known to those skilled in the art can be
used in accordance with the present invention. These other
gene delivery systems include other viral gene delivery
systems known in the art, such as the adenovirus delivery
systems.

Non-viral delivery systems can be utilized in
accordance with the present invention as well. For
example, liposomal delivery systems can deliver the
therapeutic gene to the tumor site via a liposome.
Liposomes can be modified to evade metabolism and/or to
have distinct targeting mechanisms associated with them.
For example, liposomes which have antibodies incorporated
into their structure, such as antibodies to CEA, can have
targeting ability to CEA-positive cells. This will
increase both the selectivity of the present invention as
well as its ability to treat disseminated disease
(metastasis).



Another gene delivery system which can be utilized
according to the present invention is receptor-mediated
delivery, wherein the gene of choice is incorporated into
a ligand which recognizes a specific cell receptor. This
system can also deliver the gene to a specific cell type.
Additional modifications can be made to this
receptor-mediated delivery system, such as incorporation of
adenovirus components to the gene so that the gene is not
degraded by the cellular lysosomal compartment after
internalization by the receptor.

The infective virion or the packaging cell line
according to the invention may be formulated by techniques
well known in the art and may be presented as a formulation
(composition) with a pharmaceutically acceptable carrier
therefor. Pharmaceutically acceptable carriers, in this
instance physiologic aqueous solutions, may comprise liquid
medium suitable for use as vehicles to introduce the
infective virion into a host. An example of such a carrier
is saline. The infective virion or packaging cell line may
be a solution or suspension in such a vehicle. Stabilizers
and antioxidants and/or other excipients may also be
present in such pharmaceutical formulations (compositions),
which may be administered to a mammal by any conventional
method (e.g., oral or parenteral routes). In particular,
the infective virion may be administered by intra-venous or
intra-arterial infusion. In the case of treating hepatic
metastatic CRC, intra-hepatic arterial infusion may be
advantageous. The packaging cell line can be administered
directly to the tumor or near the tumor and thereby produce
infective virions directly at or near the tumor site.



Accordingly, the present invention provides a
pharmaceutical formulation (composition) comprising an
infective virion or packaging cell line according to the
invention in admixture with a pharmaceutically acceptable
carrier.

Additionally, the present invention provides methods
of making pharmaceutical formulations (compositions), as
herein described, comprising mixing an artificial infective
virion, containing a molecular chimaera according to the
invention as described hereinbefore, with a
pharmaceutically acceptable carrier.

The present invention also provides methods of making
pharmaceutical formulations (compositions), as herein
described, comprising mixing a packaging cell line,
containing an infective virion according to the invention
as described hereinbefore, with a pharmaceutically
acceptable carrier.



Although any suitable compound that can be selectively
converted to a cytotoxic or cytostatic metabolite by the
enzyme cytosine deaminase may be utilised, the preferred
compound for use according to the invention is 5-FC, in
particular for use in treating colorectal carcinoma (CRC),
metastatic colorectal carcinoma, or hepatic CRC metastases.
5-FC, which is non-toxic and is used as an antifungal, is
converted by CD into the established cancer therapeutic
5-FU.



Any agent that can potentiate the antitumor effects of
5-FU can also potentiate the antitumor effects of 5-FC
since, when used according to the present invention, 5-FC
is selectively converted to 5-FU. According to another
aspect of the present invention, agents such as leucovorin
and levamisole, which can potentiate the antitumor effects
of 5-FU, can also be used in combination with 5-FC when
5-FC is used according to the present invention. Other
agents which can potentiate the antitumor effects of 5-FU
are agents which block the metabolism of 5-FU. Examples of
such agents are 5-substituted uracil derivatives, for
example, 5-ethynyluracil and 5-bromovinyluracil
(PCT/GB91/01650 (WO 92/04901); Cancer Research 46, 1094,
(1986) which are incorporated herein by reference in their
entirety). Therefore, a further aspect of the present
invention is the use of an agent which can potentiate the
antitumor effects of 5-FU, for example, a 5-substituted
uracil derivative such as 5-ethynyluracil or
5-bromovinyluracil in combination with 5-FC when 5-FC is used
according to the present invention. The present invention
further includes the use of agents which are metabolised in
vivo to the corresponding 5-substituted uracil derivatives
described hereinbefore (see Biochemical Pharmacology 38,
2885, (1989) which is incorporated herein by reference in
its entirety) in combination with 5-FC when 5-FC is used
according to the present invention.






5-FC is readily available (e.g., United States
Biochemical, Sigma) and well known in the art. Leucovorin
and levamisole are also readily available and well known in
the art.

Two significant advantages of the enzyme/prodrug
combination of cytosine deaminase/5-fluorocytosine and
further aspects of the invention are the following:

1. The metabolic conversion of 5-FC by CD produces 5-FU,
which is the drug of choice in the treatment of many
different types of cancers, such as colorectal carcinoma.

2. The 5-FU that is selectively produced in one cancer
cell can diffuse out of that cell and be taken up by both
non-facilitated diffusion and facilitated diffusion into
adjacent cells. This produces a neighbouring cell killing
effect. This neighbour cell killing effect alleviates the
necessity for delivery of the therapeutic molecular chimera
to every tumor cell. Rather, delivery of the molecular
chimera to a certain percentage of tumor cells can produce
the complete eradication of all tumor cells.



The amounts and precise regimen in treating a mammal
will of course be the responsibility of the attendant
physician, and will depend on a number of factors including
the type and severity of the condition to be treated.
However, for hepatic metastatic CRC, an intrahepatic
arterial infusion of the artificial infective virion at a
titer of between 2 x 10'" and 2 x 10^ colony forming units
per ml (CFU/ml) infective virions is suitable for a typical
tumour. The total amount of virions infused will be
dependent on tumour size and is preferably given in divided
doses.

Likewise, the packaging cell line is administered
directly to a tumor in an amount of between 2 x 10^ and 2 x
10 cells. The total amount of packaging cell line infused
will be dependent on tumour size and is preferably given in
divided doses.






Prodrug treatment - Subsequent to infection with the
infective virion, certain cytosine compounds (prodrugs of
5-FU) are converted by CD to cytotoxic or cytostatic
metabolites (e.g. 5-FC is converted to 5-FU) in target
cells. The above mentioned prodrug compounds are
administered to the host (e.g. mammal or human) between six
hours and ten days, preferably between one and five days,
after administration of the infective virion.


The dose of 5-FC to be given will advantageously be in
the range 10 to 500 mg per kg body weight of recipient per
day, preferably 50 to 500 mg per kg body weight of recipient
per day, more preferably 50 to 250 mg per kg body weight of
recipient per day, and most preferably 50 to 150 mg per kg
body weight of recipient per day. The modes of
administration of 5-FC in humans are well known to those
skilled in the art. Oral administration and/or constant
intravenous infusion of 5-FC are anticipated by the instant
invention to be preferable.
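
The per-kilogram ranges above translate into total daily amounts by simple multiplication. The following is a minimal illustrative sketch of that arithmetic only; the tier labels, the function name `daily_dose_mg` and the 70 kg body weight are assumptions introduced for the example and are not part of the specification, and the actual regimen remains the responsibility of the attendant physician.

```python
# Dose ranges for 5-FC in mg per kg body weight per day, as stated in the text.
# The tier labels are illustrative names, not terms from the specification.
DOSE_RANGES_MG_PER_KG = {
    "broad": (10, 500),
    "preferred": (50, 500),
    "more preferred": (50, 250),
    "most preferred": (50, 150),
}


def daily_dose_mg(body_weight_kg, tier="most preferred"):
    """Return the (low, high) total daily dose in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = DOSE_RANGES_MG_PER_KG[tier]
    return (low * body_weight_kg, high * body_weight_kg)


if __name__ == "__main__":
    # 70 kg is a hypothetical recipient weight chosen only for illustration.
    for tier in DOSE_RANGES_MG_PER_KG:
        print(tier, daily_dose_mg(70, tier))
```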

The doses and mode of administration of leucovorin and
levamisole to be used in accordance with the present
invention are well known or readily determined by those
clinicians skilled in the art of oncology.






The dose and mode of administration of the
5-substituted uracil derivatives can be determined by the
skilled oncologist. Preferably, these derivatives are
given by intravenous injection or orally at a dose of
between 0.01 and 50 mg per kg body weight of the recipient
per day, particularly 0.01 to 10 mg per kg body weight per
day, and more preferably 0.01 to 0.4 mg per kg body weight
per day depending on the derivative used. An alternative
preferred administration regime is 0.5 to 10 mg per kg body
weight of recipient once per week.






The following examples serve to illustrate the present
invention but should not be construed as a limitation
thereof. In the Examples reference is made to the Figures,
a brief description of which is as follows:

Figure 1: Diagram of CEA phage clones. The
overlapping clones lambdaCEA1, lambdaCEA7, and lambdaCEA5
represent an approximately 26 kb region of CEA genomic
sequence. The 11,288 bp HindIII-Sau3A fragment that was
sequenced is represented by the heavy line under
lambdaCEA1. The 3774 bp HindIII-HindIII fragment that was
sequenced is represented by the heavy line under
lambdaCEA7. The bent arrows represent the transcription
start point for CEA mRNA. The straight arrows represent
the oligonucleotides CS15 and CR16. H, HindIII; S, SstI;
B, BamHI; E, EcoRI; X, XbaI.






Figure 2: Restriction map of part of lambdaCEA1. The
arrow head represents the approximate location of the
transcription initiation point for CEA mRNA. Lines below
the map represent the CEA inserts of pBS+ subclones. These
subclones are convenient sources for numerous CEA
restriction fragments.

DNA sequence of the 11,288 bp HindIII to
Sau3A fragment of lambdaCEA1 (SEQ ID NO: 1). Sequence is
numbered with the approximate transcription initiation point
for CEA mRNA as 0 (this start site is approximate because
there is some slight variability in the start site among
individual CEA transcripts). The translation of the first
exon is shown. Intron 1 extends from +172 to beyond +592.
Several restriction sites are shown above the sequence. In
subclone 109-3 the sequence at positions +70 has been
altered by site-directed mutagenesis in order to introduce
HindIII and EcoRI restriction sites.

DNA sequence of the 3774 bp HindIII to
HindIII fragment of lambdaCEA7 (SEQ ID NO: 2).



Figure 3: Mapplot of 15,056 bp HindIII to Sau3A
fragment from CEA genomic DNA showing consensus sequences.
Schematic representation of some of the consensus sequences
found in the CEA sequence of Seq IDs 1 and 2. The
consensus sequences shown here are from the transcriptional
dictionary of Locker and Buzard (DNA Sequence 1, 3-11
(1990)). The lysozymal silencer is coded 313. The last
line represents 90% homology to the topoisomerase II
cleavage consensus.

Figure 4: Cloning scheme for CEA constructs extending
from -299 bp to +69 bp.

Figure 5A: Cloning scheme for CEA constructs
extending from -10.7 kb to +69 bp.

Figure 5B: Coordinates for CEA sequence present in
several CEA/luciferase clones. CEA sequences were cloned
into the multiple cloning region of pGL2-Basic (Promega
Corp.) by standard techniques.



Figures 5C and 5D: Transient luciferase assays.
Transient transfections and luciferase assays were
performed in quadruplicate by standard techniques using
DOTAP (Boehringer Mannheim, Indianapolis, IN, USA),
luciferase assay system (Promega, Madison, WI, USA), and
Dynatech luminometer (Chantilly, VA, USA). CEA-positive
cell lines included LoVo (ATCC #CCL 229) and SW1463 (ATCC
#CCL 234). CEA-negative cell lines included HuH7 and Hep3B
(ATCC #HB 8064). C. Luciferase activity expressed as the
percent of pGL2-Control plasmid activity. D. Luciferase
activities of LoVo and SW1463 expressed as fold increase
over activity in Hep3B.
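
The two normalisations named for Figures 5C and 5D can be sketched as follows. This is an illustrative sketch only: the function names and every reading below are hypothetical, and it assumes that the fold increase is taken on the percent-of-control values, which the text does not state explicitly.

```python
from statistics import mean


def percent_of_control(sample_readings, control_readings):
    """Mean activity of a construct as a percentage of the mean
    pGL2-Control activity measured in the same cell line."""
    return 100.0 * mean(sample_readings) / mean(control_readings)


def fold_over_reference(activity, reference_activity):
    """Activity in a CEA-positive line divided by the activity of the
    same construct in the CEA-negative reference line (here Hep3B)."""
    return activity / reference_activity


if __name__ == "__main__":
    # Hypothetical quadruplicate luminometer readings (arbitrary units).
    lovo_construct = [200, 220, 180, 200]
    lovo_control = [1000, 1100, 900, 1000]
    hep3b_construct = [40, 40, 40, 40]
    hep3b_control = [2000, 2000, 2000, 2000]

    lovo_pct = percent_of_control(lovo_construct, lovo_control)
    hep3b_pct = percent_of_control(hep3b_construct, hep3b_control)
    print(lovo_pct, hep3b_pct, fold_over_reference(lovo_pct, hep3b_pct))
```

Normalising to the control plasmid within each cell line first is one way to correct for differences in transfection efficiency between lines before comparing them.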

Example 1 


Construction of transcriptional regulatory sequence of
carcinoembryonic antigen/cytosine deaminase molecular
chimaera

A) [...]

[...] isolated from [...] Research, [...], which is herein
incorporated [...]. [...] identified by plaque hybridization
to 32P end-labelled oligonucleotides CS15 and CR16. CS15 [...]
little homology to other members of the CEA gene family.
[...] DNA isolated from three clones that hybridized to [...].
The three clones are designated lambdaCEA1, lambdaCEA5 and
lambdaCEA7, and have inserts of approximately [...] kb
respectively. A partial restriction map of the three
overlapping clones is shown in Figure 1.

Clone lambdaCEA1 was initially chosen for extensive
analysis. Fragments isolated from lambdaCEA1 were subcloned
using standard techniques into the plasmid pBS+ (Stratagene
Cloning Systems, La Jolla, CA, USA) to facilitate
sequencing, site-directed mutagenesis, and construction of
[...]. [...] HindIII/Sau3A restriction fragment [...] by the
dideoxy sequencing method [...] Technologies, Inc. and
multiple oligonucleotide primers. This sequence extends
from -10.7 kb to +0.6 kb relative to the start site of CEA
mRNA. The sequence of the 3774 base pair HindIII
restriction fragment from lambdaCEA7 was also determined
(SEQ ID NO: 2). This sequence extends from -14.5 kb to
-10.7 kb relative to the start site of CEA mRNA. This
HindIII fragment is present in plasmid pCR145.

To determine important transcriptional regulatory
sequences, various fragments of CEA genomic DNA are linked
to a reporter gene such as luciferase or chloramphenicol
acetyltransferase. Various fragments of CEA genomic DNA are
tested to determine the optimized, cell-type specific TRS
that results in high level reporter gene expression in
CEA-positive cells but not in CEA-negative cells. The
various reporter constructs, along with appropriate
controls, are transfected into tissue culture cell lines
that express high, low, or no CEA. The reporter gene
analysis identifies both positive and negative
transcriptional regulatory sequences. The optimized
CEA-specific TRS is identified through the reporter gene
analysis and is used to specifically direct the expression
of any desired linked coding sequence, such as CD or VZV
TK, in cancerous cells that express CEA. The optimized
CEA-specific TRS, as used herein, refers to any DNA
construct that directs suitably high levels of expression
in CEA-positive cells and low or no expression in
CEA-negative cells. The optimized CEA-specific TRS consists
of one or several different fragments of CEA genomic
sequence or multimers of selected sequences that are linked
together by standard recombinant DNA techniques. It will be
appreciated by those skilled in the art that the optimized
CEA-specific TRS may also include some sequences that are
not derived from the CEA genomic sequences shown in Seq IDs
1 and 2. These other sequences may include sequences from
adjoining regions of the CEA locus, such as sequences from
the introns, or sequences further upstream or downstream
from the sequenced DNA shown in Seq IDs 1 and 2, or they
could include transcriptional control elements from other
genes that when linked to selected CEA sequences result in
the desired CEA-specific regulation.

The CEA sequences of Seq IDs 1 and 2 were computer
analyzed for characterized consensus sequences which have
been associated with gene regulation. Currently not enough
is known about transcriptional regulatory sequences to
accurately predict by sequence alone whether a sequence
will be functional. However, computer searches for
characterized consensus sequences can help identify
transcriptional regulatory sequences in uncharacterized
sequences since many enhancers and promoters consist of
unique combinations and spatial alignments of several
characterized consensus sequences as well as other
sequences. Since not all transcriptional regulatory
sequences have been identified and not all sequences that
are identical to characterized consensus sequences are
functional, such a computer analysis can only suggest
possible regions of DNA that may be functionally important
for gene regulation.
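
A computer search of this kind can be sketched as a scan of the sequence for matches to a consensus written with IUPAC ambiguity codes. The sketch below is illustrative only and is not the program used for the analysis described here: the function names are hypothetical, the example motif (a generic TATA-box style consensus, TATAWAW) and the test sequence are chosen purely for demonstration, and only the given strand is scanned.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regular-expression fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus.upper())


def scan(sequence, consensus):
    """Return 0-based start positions of every match, including
    overlapping matches, on the given strand only."""
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical test sequence and motif, for demonstration only.
    print(scan("GGTATAAAAGG", "TATAWAW"))
```

As the text notes, a hit from such a scan only suggests a candidate region; it does not show that the site is functional.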

Some examples of the consensus sequences that are
present in the CEA sequence are shown in Figure 3. Four
copies of a lysozymal silencer consensus sequence have been
found in the CEA sequence. Inclusion of one or more copies
of this consensus sequence in the molecular chimera can
help optimize CEA-specific expression. A cluster of
topoisomerase II cleavage consensus sequences identified
approximately 4-5 kb upstream of the CEA transcriptional
start suggests that this region of CEA sequence may contain
important transcriptional regulatory signals that may help
optimize CEA-specific expression.

The first fragment of CEA genomic sequence analyzed for
transcriptional activity extends from -299 to +69, but it
is appreciated by those skilled in the art that other
fragments are tested in order to isolate a TRS that directs
strong expression in CEA-positive cells but little
expression in CEA-negative cells. As diagrammed in Figure
4, the 943 bp SmaI-HindIII fragment of plasmid 39-5-5 was
subcloned into the SmaI-HindIII sites of vector pBS+
(Stratagene Cloning Systems), creating plasmid 96-11.
Single-stranded DNA was rescued from cultures of XL1-blue
96-11 using an M13 helper virus by standard techniques.
Oligonucleotide CR70,
5'-CCTGGAACTCAAGCTTGAATTCTCCACAGAGGAGG-3' (SEQ ID NO: 5), was
used as a primer for oligonucleotide-directed mutagenesis
to introduce HindIII and EcoRI restriction sites at +65.
Clone 109-3 was isolated from the mutagenesis reaction and
was verified by restriction and DNA sequence analysis to
contain the desired changes in the DNA sequence. CEA
genomic sequences -299 to +69 (original numbering, Figure 3)
were isolated from 109-3 as a 381 bp EcoRI/HindIII
fragment. Plasmid pRc/CMV (Invitrogen Corporation, San
Diego, CA, USA) was restricted with AatII and HindIII and
the 4.5 kb fragment was isolated from low melting point
agarose by standard techniques. The 4.5 kb fragment of
pRc/CMV was ligated to the 381 bp fragment of 109-3 using
T4 DNA ligase. During this ligation the compatible HindIII
ends of the two different restriction fragments were
ligated. Subsequently the ligation reaction was
supplemented with the four deoxynucleotides, dATP, dCTP,
dGTP, and dTTP, and T4 DNA polymerase in order to blunt the
non-compatible AatII and EcoRI ends. After incubating,
phenol extracting, and ethanol precipitating the reaction,
the DNAs were again incubated with T4 DNA ligase. The
resulting plasmid, pCR92, allows the insertion of any
desired coding sequence into the unique HindIII site
downstream of the CEA TRS, upstream from a polyadenylation
site and linked to a dominant selectable marker. The
coding sequence for CD or other desirable effector or
reporter gene, when inserted in the correct orientation
into the HindIII site, is transcriptionally regulated by
the CEA sequences and is preferably expressed in cells
that express CEA but not in cells that do not express CEA.

In order to determine the optimized CEA TRS, other
reporter gene constructs containing various fragments of
CEA genomic sequences are made by standard techniques from
DNA isolated from any of the CEA genomic clones (Figures 1,
2, 4, and 5). DNA fragments extending from the HindIII
site introduced at position +65 (original numbering Figure
3A) and numerous different upstream sites are isolated and
cloned into the unique HindIII site in plasmid
pSVOALdelta5' (De Wet, J.R., et al. Mol. Cell. Biol., 7,
725-737 (1987) which is herein incorporated by reference in
its entirety) or any similar reporter gene plasmid to
construct luciferase reporter gene constructs, Figures 4
and 5. These and similar constructs are used in transient
expression assays performed in several CEA-positive and
CEA-negative cell lines to determine a strong, CEA-positive
cell-type specific TRS. Figures 5B, 5C, and 5D show the
results obtained from several CEA/luciferase reporter
constructs. The optimized TRS is used to regulate the
expression of CD or other desirable gene in a cell-type
specific pattern in order to be able to specifically kill
cancer cells. The desirable expression cassette is added
to a retroviral shuttle vector to aid in delivery of the
expression cassette to cancerous tissue.
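
The HindIII site at +65 used for these constructs was introduced with the mutagenic oligonucleotide CR70 described above. As a minimal sketch, the presence of the two recognition sites in the oligonucleotide can be checked computationally. The sequence string below is a reading of the printed CR70 sequence and should be treated as an assumption; the recognition sequences AAGCTT (HindIII) and GAATTC (EcoRI) are the standard ones, and the function name `find_sites` is hypothetical.

```python
# Standard recognition sequences of the two enzymes named in the text.
SITES = {"HindIII": "AAGCTT", "EcoRI": "GAATTC"}

# Assumed reading of oligonucleotide CR70 (SEQ ID NO: 5), written 5' to 3'.
CR70 = "CCTGGAACTCAAGCTTGAATTCTCCACAGAGGAGG"


def find_sites(sequence, sites=SITES):
    """Return, for each enzyme, the 0-based positions at which its
    recognition sequence starts in the given strand."""
    sequence = sequence.upper()
    hits = {}
    for enzyme, site in sites.items():
        hits[enzyme] = [
            i for i in range(len(sequence) - len(site) + 1)
            if sequence.startswith(site, i)
        ]
    return hits


if __name__ == "__main__":
    print(len(CR70), find_sites(CR70))
```

Both recognition sequences are palindromic, so scanning one strand is sufficient for this check.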

Strains containing plasmids 39-5-5 and 39-5-2 were
deposited at the ATCC under the Budapest Treaty with
Accession Nos. 68904 and 68905, respectively. A strain
containing plasmid pCR92 was deposited with the ATCC under
the Budapest Treaty with Accession No. 68914. A strain
containing plasmid pCR145 was deposited at the ATCC under
the Budapest Treaty with Accession No. 69460.

B) Cloning and isolation of cytosine deaminase (CD)

The cloning, sequencing and expression of E. coli CD
has already been published (Austin & Huber, Molecular
Pharmacology, 43, 380-387 (1993), the disclosure of which
is incorporated herein by reference). A positive genetic
selection was designed for the cloning of the codA gene
from E. coli. The selection took advantage of the fact
that E. coli is only able to metabolize cytosine via CD.
Based on this, an E. coli strain was constructed that could
only utilize cytosine as a pyrimidine source when cytosine
deaminase was provided in trans. This strain, BA101,
contains a deletion of the codAB operon and a mutation in
the pyrF gene. The strain was created by transducing a
pyrF mutation (obtained from the E. coli strain X32 (E.
coli Genetic Stock Center, New Haven, CT, USA)) into the
strain MBM7007 (W. Dallas, Burroughs Wellcome Co., NC, USA)
which carried a deletion of the chromosome from lac to
argF. The pyrF mutation confers a pyrimidine requirement
on the strain, BA101. In addition, the strain is unable to
metabolize cytosine due to the codAB deletion. Thus, BA101
is able to grow on minimal medium supplemented with uracil
but is unable to utilize cytosine as the sole pyrimidine
source.

The construction of BA101 provided a means for
positive selection of DNA fragments encoding CD. The strain,
BA101, was transformed with plasmids carrying inserts from
the E. coli chromosome and the transformants were selected
for growth on minimal medium supplemented with cytosine.
Using this approach, the transformants were screened for
the ability to metabolize cytosine, indicating the presence
of a DNA fragment encoding CD. Several sources of DNA
could be used for the cloning of the codA gene: 1) a
library of the E. coli chromosome could be purchased
commercially (for example from Clontech, Palo Alto, CA, USA
or Stratagene, La Jolla, CA, USA) and screened; 2)
chromosomal DNA could be isolated from E. coli, digested
with various restriction enzymes and ligated to plasmid
DNA with compatible ends before screening; and/or 3)
bacteriophage lambda clones containing mapped E. coli
chromosomal DNA inserts could be screened.
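
The logic of the positive selection can be summarised as a small truth-table model. The sketch below is a deliberate simplification that considers only the two properties named in the text (the pyrF lesion and the presence or absence of cytosine deaminase); the class, field and function names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    pyrF_functional: bool          # can make pyrimidines de novo
    has_cytosine_deaminase: bool   # chromosomal or plasmid-borne CD


def grows_on_minimal_medium(strain, supplement=None):
    """Predict growth on minimal medium with an optional pyrimidine
    supplement ('uracil' or 'cytosine') under the simplified model."""
    if strain.pyrF_functional:
        return True                      # no pyrimidine requirement
    if supplement == "uracil":
        return True                      # uracil satisfies the requirement
    if supplement == "cytosine":
        return strain.has_cytosine_deaminase  # cytosine is usable only via CD
    return False


# BA101: pyrF mutation plus codAB deletion, as described in the text.
BA101 = Strain(pyrF_functional=False, has_cytosine_deaminase=False)
# BA101 carrying a plasmid insert that encodes CD.
BA101_WITH_CD_PLASMID = Strain(pyrF_functional=False, has_cytosine_deaminase=True)

if __name__ == "__main__":
    for name, strain in [("BA101", BA101), ("BA101 + CD plasmid", BA101_WITH_CD_PLASMID)]:
        print(name, {s: grows_on_minimal_medium(strain, s) for s in (None, "uracil", "cytosine")})
```

Under this model only transformants that have acquired a CD-encoding insert grow on minimal medium with cytosine, which is the basis of the selection.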

Bacteriophage lambda clones (Y. Kohara, National
Institute of Genetics, Japan) containing DNA inserts
spanning the 6-8 minute region of the E. coli chromosome
were screened for the ability to provide transient
complementation of the codA defect. Two clones, 137 and
138, were identified in this manner. Large-scale
preparations of DNA from these clones were isolated from
500 ml cultures. Restriction enzymes were used to generate
DNA fragments ranging in size from 10-12 kilobases. The
[...] EcoRI and BamHI, and EcoRI and HindIII. DNA fragments
of the desired size were isolated from preparative agarose
gels by electroelution. The isolated fragments were ligated
to pBR322 (Gibco BRL, Gaithersburg, MD, USA) with compatible
ends. The resulting ligation reactions were used to
transform the E. coli strain, DH5α (Gibco BRL,
Gaithersburg, MD, USA). This step was used to amplify the
recombinant plasmids resulting from the ligation reactions.
The plasmid DNA preparations isolated from the
ampicillin-resistant DH5α transformants were digested with
the appropriate restriction enzymes to verify the presence
of insert DNA. The isolated plasmid DNA was used to
transform BA101. The transformed cells were selected for
resistance to ampicillin and for the ability to metabolize
cytosine. Two clones were isolated, pEA001 and pEA002. The
plasmid pEA001 contains an approximately 10.5 kb
EcoRI-BamHI insert while pEA002 contains an approximately
11.5 kb EcoRI-HindIII insert. The isolated plasmids were
used to transform BA101 to ensure that the ability to
metabolize cytosine was the result of the plasmid and not
due to a spontaneous chromosomal mutation.



A physical map of the pEA001 DNA insert was generated
using restriction enzymes. Deletion derivatives of pEA001
were constructed based on this restriction map. The
resulting plasmids were screened for the ability to allow
BA101 to metabolize cytosine. Using this approach, the
codA gene was localized to a 4.8 kb EcoRI-BglII fragment.
The presence of codA within these inserts was verified by
enzymatic assays for CD activity. In addition, cell
extracts prepared for enzymatic assay were also examined by
polyacrylamide gel electrophoresis. Cell extracts that
were positive for enzymatic activity also had a protein
band migrating with an apparent molecular weight of 52,000.



The DNA sequence of both strands was determined for a
1634 bp fragment. The sequence determination began at the
PstI site and extended to the PvuII site, thus including the
codA coding domain. An open reading frame of 1283
nucleotides was identified. The thirty amino terminal
amino acids were confirmed by protein sequencing.
Additional internal amino acid sequences were generated
from CNBr-digestion of gel-purified CD.



A 200 bp PstI fragment was isolated that spanned the
translational start codon of codA. This fragment was
cloned into pBS+. Single-stranded DNA was isolated from 30
ml culture and mutagenized using a custom oligonucleotide
BA22 purchased from Synthecell Inc., Rockville, MD, USA and
the oligonucleotide-directed mutagenesis kit (Amersham,
Arlington Heights, IL, USA). The base changes result in
the introduction of a HindIII restriction enzyme site for
joining of CD with CEA TRS and in a translational start
codon of ATG rather than GTG. The resulting 90 bp
HindIII-PstI fragment is isolated and ligated with the
remainder of the cytosine deaminase gene. The chimeric CEA
TRS/cytosine deaminase gene is created by ligating the
HindIII-PvuII cytosine deaminase-containing DNA fragment
with the CEA TRS sequences.



The strain BA101 and the plasmids, pEA001 and pEA003,
were deposited with ATCC under the Budapest Treaty with
Accession Nos. 53299, SS916-, and 63915 respectively.

C) Construction of transcriptional regulatory sequence of
carcinoembryonic antigen/cytosine deaminase molecular
chimera

A 1508 bp HindIII/PvuII fragment containing the coding
sequence for cytosine deaminase is isolated from the
plasmid containing the full length CD gene of Example 1B
that has been altered to contain a HindIII restriction site
just 5' of the initiation codon. Plasmid pCR92 contains
[...] immediately 5' to a unique HindIII restriction site
and a polyadenylation signal 3' to a unique ApaI
restriction site (Example 1A, Figure 4). pCR92 is
linearised with ApaI, the ends are blunted using dNTPs and
T4 DNA polymerase, and subsequently digested with HindIII.
The pCR92 HindIII/ApaI fragment is ligated to the 1508 bp
HindIII/PvuII fragment containing cytosine deaminase.
Plasmid pCEA-1/codA, containing CD inserted in the
appropriate orientation relative to the CEA TRS and
polyadenylation signal, is identified by restriction enzyme
and DNA sequence analysis.

The optimized CEA-specific TRS, the coding sequence 
for CD with an ATG translation start, and a suitable 
polyadenylation signal are joined together using standard 
molecular biology techniques. The resulting plasmid, 
containing CD inserted in the appropriate orientation 
relative to the optimized CEA specific TRS and a 
polyadenylation signal is identified by restriction enzyme 
and DNA sequence analysis. 

Example 2

Construction of a retroviral shuttle vector containing the
molecular chimera of Example 1

The retroviral shuttle vector pL-CEA-1/codA is constructed
by ligating a suitable restriction fragment containing the
optimized CEA TRS/codA molecular chimera including the
polyadenylation signal into an appropriate retroviral
shuttle vector, such as N2(XM5) linearised at the XhoI
site, using standard molecular biology techniques. The
retroviral shuttle vector pL-CEA-1/codA is characterized by
restriction endonuclease mapping and partial DNA
sequencing.

Example 3

Virus Production of Retroviral Constructs of Example 2

The retroviral shuttle construct described in Example 2 is
placed into an appropriate packaging cell line, such as
PA317, by electroporation or infection. Drug resistant
colonies, such as those resistant to G418 when using
shuttle vectors containing the NEO gene, are single cell
cloned by the limiting dilution method, analyzed by
Southern blots, and titred in NIH 3T3 cells to identify the
highest producer of full-length virus.

Example 4 

Further data on the CEA TRS 

In addition to the plasmids shown in Figure 5B, the
following combinations of regions have proved particularly
advantageous at high level expression of the reporter gene
in the system described in Example 1A:

pCR177:
(-14.5kb to -10.6kb) + (-6.1kb to -3.9kb) + (-299b to +69b)

pCR176:
(-13.6kb to -10.6kb) + (-6.1kb to -3.9kb) + (-299b to +69b)

pCR165:
(-3.9kb to -6.1kb) + (4x -90b to +69b)

pCR168:
(-13.6kb to -10.6kb) + (4x -90b to +69b).
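
The approximate amount of CEA sequence carried by each combination can be tallied from the coordinates. The sketch below is illustrative arithmetic only: the coordinates are as read from the list above (kb values are rounded in the text, so totals are approximate), segment length is taken simply as the distance between the two endpoints, "4x" is treated as four tandem copies, and the function names are hypothetical.

```python
def segment_length_bp(start_bp, end_bp, copies=1):
    """Approximate length of a segment given its two endpoints relative
    to the transcription start site, times the number of copies."""
    return abs(end_bp - start_bp) * copies


# (start, end, copies) in base pairs relative to the CEA mRNA start site,
# as read from the list of combinations in the text.
CONSTRUCTS = {
    "pCR177": [(-14500, -10600, 1), (-6100, -3900, 1), (-299, 69, 1)],
    "pCR176": [(-13600, -10600, 1), (-6100, -3900, 1), (-299, 69, 1)],
    "pCR165": [(-3900, -6100, 1), (-90, 69, 4)],
    "pCR168": [(-13600, -10600, 1), (-90, 69, 4)],
}


def total_length_bp(segments):
    """Sum the approximate lengths of all segments in a construct."""
    return sum(segment_length_bp(*segment) for segment in segments)


if __name__ == "__main__":
    for name, segments in CONSTRUCTS.items():
        print(name, total_length_bp(segments), "bp (approximate)")
```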
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SIQUSNCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: The Wellcome Foundation Limited

(B) STREET: Unicorn House, 160 Euston Road

(C) CITY: London

(E) COUNTRY: G.B.

(F) POSTAL CODE (ZIP): NW1 2BP

(ii) TITLE OF INVENTION: Transcriptional Regulatory Sequence
(iii) NUMBER OF SEQUENCES: 5

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11288 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
AAGCTTAAAA CCCAATGGAT TGACAACMC AAGaCTTCGA ACAAGTGGAC ATGCacaTCT 60

XACTTGTCGA AATTTAGATG TCTTCAGCTA TCSCGCAfiGA GAMCreTGT CAMIiaaG 120

CATGGTTCAG AAGAATCAAA AAGT6TCACA CXCCAAATGT GCAACaCTGC A6G66AZAAA 180

ACTGTGGTGC ATTCAAACTC AGGGATaTTT TCGAACATG& 6AAA66AA66 GATTGCIGCr 240

GCACAGAACA TGGATGATCT CACACATA6A GTTGaAAGAA ACCACTCAAT CCCAGAAtaG 300

AAAATGAXCA CTAATTCCAC CTCTATAAAG TTTCCAAGAG GAAAACCCAA TTCTGCTGCT 360

AGAGATCACA ATGGAGGTGA CCTGTGCCTT GCAATGGCTG TGA6GGTCAC G6GAGTGTCA 420

CTTAGTGCAG GCAATGTGCC GTAICTTAAT CTG6GCAGGG CTTTCATGAG CACAIAG6AA 480

TCCAGACATT ACTGCTGTGT TCATTTTACT TCACCG6AAA AGAAGAAIAA AATCAGCCG6 540

TCGCGGTGGC TCACGCCTCT AATCCCAGCA CTTTAGAAGG CTGAGGTGGG CAGATTACTT 600

GAGGTCAGGA GTTCAAGACC ACCCTGGCCA AIAXGGT6AA ACCCCGGCTC TACTAAAAAT 660

ACAAAAATTA GCTGGGCATC GTCCTGCCCG CCTCTAATCC CAGCtACTCG GGAGGCTCAG 720

CCTCS&CaAT TCCTTG5ACC CAGGAACCAC AGSTTGCACT GACCCAAGAT TCTGCCACrC 780

CACTCCAGCT IGGGCAACAG AGCCAGACTC rSTAAAAAAA AAAAAAAAAA AAAAAAAAAS 840 

AAAGAAAGAA AAAGAAAAGA AAGTATAAAA rCTCTTTGGG rTAACAAAAA AAGATCCACa 900 

AAACAAACaC CAGCTCTTAr CAAACTTACA CAACTCTGCC AGACAACAGG AAACACAAAT 960 

aCTCAT-AAC TCACTTTTG? GGCAATAAAA =Cr?CATGTC AAAAGGAGAC CAGGACACAA 1020 

TGAGGAAGTA AAAC7GCAGG CCCTACT7GG STGCAGAGAG GGAAAATCCA CAAA=AAAAC 1080 

AXTACCAGAA GGAGCTAAGA TTTACTGCAT rGAGTTCATT CCCSACGTAT GCAAG57GAT 1140 

TTTAACacCT GAAAACCAAr CArTGCCTT? ACTACaiAGA CAGArTAGCT AGAAAAAAAT 1200 

TACAACTA6C A6AACACAAG CAATTTGCCC ~CCTAAAAT TCCACATCAT ATCATCATCA 1260 

TGGAGACaGT GCSIGACSCCA ATGACAATAA AAAGAGGGAC CTCCSTCACC CGGTAAACAT 1320 

GTCCACACAG CTCCAGCAAG CACCCGTCTT CCCAGTGAAT CACTGTAACC TCCCCTTaA 1380 

TCAGCCCCAG GCAAGGCTGC CrCCGATGCC C?.CACAGGCr CCAACCCGTG GGCCTCAACC 1440 

TCCCCCAGAG GCTCTCCTr: GGCCACCCCA rSGGGAGAGC ATSAGGACAG GGCAGACCCC ISOO 

rCTGATGCCC ACACATGGCA GGAGCTGACC CCAGAGCCAT GGGGGCTGGA GAGCACACCT 1S60 

GCTGGGCTCA GAGCTTCCTG AGGACACCCA GGCCTAAGGG AAGGCAGCTC CCTGGATGGG 1620 

GGCAACCAGG CTCCGGGCTC CAACCTCAGA GCCCGCAXGG GAGGAGCCAG CACXraiGGC 1680 

CTTTCCTAGG GTGAC7CTGA GGGGACCXTG ACaCGACAGG AXCCCTGAAT GCACCCGACA 1740 

TGAAGGGGCC ACCACGGGAC CCTGCTCTCC rOGCAGATCA GGAGAGAGTO GGACACCAIG 1800 

CCAGGCCCCC ATG6CAIGGC TGCGACTGAC CCAGGCCACT CCCCTCCATG CATCAGCCTC 1860 

GGTAAGTCAC ATGACCAAGC CCAGGACCAA TGTCGAAGCA AGGAAACAGC ATCCCCTTIA 1920 

GTGATGOAAC CCaAGGTCAC TGCAAASAGA OGCCaTCACC AGTTAGCAAG GGTGGTOaA 1980 

CCTACA6CAC AAACtaiCAT CIATCAXAAC lASAAGCCCT 6CTCCAT6AC CCCTCCATTT 2040 

AAAIAAACGT TTGTTaAAtG AGtCAAAITC CCXCACCATC ASAGCTCACC TGTGTCTaOG 2100 

CCCaiCaCAC ACACAAACAC ACACaCACAC ACACACACAC ACaCACACAC ACACAGCGAA 2160 

AOTGCACCAT CCIGCACA6C ACCAGGCACC CtTCACAOGC AGAGCAAACA CCGWAAIOA 2220 

CCCATGCAGT GCCCXGGGCC CXATCAGCTC AGAGACCCTG iraGCGCTGA GATGGG6CIA 2280 

GGCAGGCGAG AGACTTaGAG AGGGTGGGGC CTCCAGGGAG GGGGCIGCAG GGAGCT GCG T 2340 

ACTGCCCTCC AGGGAGGGG6 CTGCaGGGAG CTGGGTACTG CCCTCCAGG6 AGGCGGCTGC 2400 

AGGGAGCTGG GTACTGCCCT CCaGGGAGGG GGCTGCAGGG AGCTGGGTAC TGCCCTCCAG 2460 

GGACGGGGCT GCAGGGAGCT GCGTACTGCC CTCCAGGGAG GCAGGAGCAC TGTTCCCAAC 2520 

AGAGAGCACA TCTTCCTGCA GCAGCTGCAC ACACACAGGA GCCCCCATGA CTGCCCTGGG 2580 

CCAGGGTGTG GATTCCAAAT TTCGTGCCCC ATTGGGTGGG ACGCAGGTTG ACCGTGACAT 2640 

CCAAGGGGCA TCTGTGATTC CAAACTTAAA CTACTGTGCC TACAAAATAG GAAATAACCC 2700 

TACTTTTTCT ACTATCTCAA ATTCCCTAAG CACAAGCTAG CACCCITTAA ATCASGAACT 2760

TCAGTCaCTC CTGCSGrccr CCCATGCCCC SIG7CT6ACT TCC&CGTGCA CaCCCrCCCT 2820

GAcarcroTc CTrocrccrc ctcttggctc aactgccgcc ccrcctscGc ctgactgatg mso 

GrcaG<acaA qggatcctag agctcccccc atgattgaca ggaacscaog actwscctc 2940 

CATTCTGAAG ACTAGCGGTG r«AGAGACC TGSGCATCCC ACACAGCTGC ACaACASGAC 3000 

GCCGACaCAG GSTGACAttG GGCTCAOGGC -CaCACGGS TC5SGAGCCT CAGCTSACaG 3060 

TTCaOGGACA GACCTGAGGA GCCTCAGTGG GiAAAGAAGC ACTGAAGTGG GAACrrCTGG 3120 

AAXG—CTGG ACAAGCCTSA GTGCTCTAaG GXAATGCTCC CACCCCGATG TAGCCTGCAG 3180 

CACTG5ACGG TCTGTG31CC TCCCC6CTGC CCArCCTCTC ACTnGCCCCCG CCTCTSGGGA 3240 

CACAACrcCT GCCCXAAtar GCiTCTTTCC 7GTCTCATTC CACACAAAAG GGCCTSTGGG 3300 

GTCCCTGitC TCCATTGC&A GGaGTGGAGS TCACGrrCCC ACAGACCACC CAGOJiaGG 3360 

STCCTATGCA GGTGCGGrCV GGAGGATCAC AC3TCCCCCC ATGCCCAGGG GACTC-nCTCT 3420 

GGGGGTG&IO GATTGGCCTG GAGGCCACTG CrCCCCTCTG TCCCTGAGGG GAAICTSCac 3480 

CCTGGACGCT GCCACArccc rcCTGATTCT r?CAGCTGAG GGC=— CT7 GAAArCCC&C 3S40 

GGAGGACrCA ACCCCCACTG GGAAAGCCCC AGIGIGGACG GTTCCACAGC AGCCaociA 3600 

AGGCCCrrCG ACACAGATCC TGAGTGAGAG .UCCITTAGG GACICAGGTG CACGGCCaiG 3660 

TCCCCAGTGC CCACACAGAG CAGGGGCATC rGGACCCTGA GT67GTACCT CCCGCSaCTC 3720 

AACCCSGCCC TTCCCCaATG ACGTGACCCC rGGGGTGGCT CCAGGTCTCC AGTC«SGCC 3780 

ACCaAAaiCT CCAGATTGAG GGTCCTCCCT TGaCTCCCTG ATCCCTGTCC AGGAGCTGCC 3840 

CCCTGAGCAA ATCTAGAGTG CaGAGCGCTG G5AITGT6GC AGXAJUIAGCA GCCACaSITC 3900 

TCTCAGGAAG GAAAGGGAGG ACMGAGCTC CSGCaaGCGC GaiGGCCTCC TCTACTSGOC 3960 

GCCTCCrCTT AAXGACCaaA AAGGCGCCaG GiGACTTCAS AGATCAGGGC TGGCCTTGGa 4020 

CTAAGGCrCA GATGGAGaGG ACTGAGGTGC AaacaGGGGG CTGAaGTACG GGACTCGTC8 4080 

GGACAGaiGS GAGGAGCMG TSaGGGGAAG CCCCaGGCAG GCCCCCCCAC CGTACSCCaG 4140 

AGCTCTCCaC TCCICaGCaT TGACMTTGG G6T6CTCCTC CTaCTGCeCT TCTGTaaCIT 4200 

etRGeGisiT cMCRccaTc tgoggactct AcccaciaaA TGaaGcacc actccctccc 426o 

caAocreaaA caAccaAcaA TsrcrccacA cmccaAAT cTccccresA cAccaaaair 4320 

eCTTCTGGCA GAATCACTGA TCIACCTC3WS ICTCTAAAAG TGACTCAtCA GCCaaaiCCT 4380 

TCACCTCTTG GGAGAACAAI C&«aCTGTG AGMGGGTA6 AAACTGCA6A CTTCaaAMC 4440 

TTTCcaaaac AGTrrracrx aatcaccagt ttgatgtccc aggagaagat acattsigag 4500 

TGTTTAGACT TGATGCCACA TGGCTGCCTG SACCTCACaG CAGGAGCAGA GTGGGTTTTC 4560

OAGGCCCTG TAACCACAAC TGGAATGACA CTCACTGGGT TACATTACAA AGTGGAATGT 4620 

GGGGAATTCT GTA6ACITTC G6AAGG6AAA T6TAT6AC6T GAGCCCACAG CCTAAGCCAC 4680 

TGGACAGTCC ACTTTGACGC TCTCACCATC TAGGAGACAT CTCACCCATG AACATAGCCa 4740 

CATCTGTCAT TAGAAAACAT CITTTATTAA GAGCAAAAAT CTACGCTAGA AGTGCTmi 4800

GCTCTrTTTT CTCTTTArC? tCAAATTCAT ArACCTTTAG ATCATTCCTT AAAGAACAAT 4860

CTATCCCCCT AAGTAAArS? 7ATCACTGAC rGGATAGTCT TGGrGTCTCA CTCCCASiCCC 4920 

CTGTGTGGTG ACACTCCCCT GCTTCCCCAG CCCrSGGCCC TCTCTGATTC CTGAGACCT? 4980 

TGGG7CCTCC TTCATtAGCA GGAAGAGACG AAGGGTCTTT TrAATATTCT CACCArTCAC 5040 

CCATCCACCT CTTAGACACT GGGAAGAATC AGTtGCCCAC TCrTGGArTT GATCCTCSAA 5100 

TTAATCACCT CTATTTCrCt CCCTTGTCCA rtrCAACAAT GTGACACGCC TAAGAGGTCC 5160

CTTCtCCATG TGATTTTTSA GGAGAAGGTT CTCAAGATAA GtrrTCTCAC ACCTCrTTGA 5220 

ATTACCrCCA CCTCTCrCCC CATCACCATT ACCAGCAGCA TTTCGACCCT TTTrCTGITA 5280 

GTCAGATGCT TTCCACCrC? TGAGGGTGTA TACTCTATGC TCTCTACACA GGAArASGCA 5340 

GAGGAAAiAG AAAAACGGAA ATCGCATTAC ZATTCAGAGA GAAGAAGACC TTTAtGrCAA 5400 

TGAATGAGAG TCTAAAATCC TAAGAGA6CC CATAtAAAAT TATTACCAGT GCTAAAACTA 5460 

CAAAAGTtAC ACTAACAGTA AACTAGAATA AIAAAACATG CATCACAGTT GCTCGTAAAG 5520 

CTAAArCAGA TATTTTTTTC TTAGAAAAAG CATTCCATGI GTGTTGCAGT GATGACAGGA 5580 

GTGCCCrrCA GXCAATArCC TGCCTGTAAT rCTrCTTCCC TGGCAGAATG TATrGTCTT? 5640 

TCTCCCTTTA AATCTTAAAT GCAAAACTAA ACGCAGCTCC TGGCCCCCCT CCCCAAAGTC 5700 

AGCTGCCTGC AACCACCCCC ACGAAGAGCA GAGGCCTGAG CTrCCCTGGT CAAAATAGGG 5760 

GGCTAGGGAG CTTAACCTTG CTCGATAAAG CTGTGTTCCC AGAATGTCGC TCCTCrrCCC 5820 

AGGGGCACCA GCCTGGACGG TGG7GAGCCT dCTGGTGGC CTGATCCTTA CCTTGTCCCC 5880 

TCACACCAGT GGTCACTGGA ACCTTGAACA CTrGGCTGTC GCCCCGATCT GCAGA7G7CA 5940 

AGAAC7TCTG GAAGTCAAAT TACTGCCCAC rrCTCCAGGG CAGATACCTG TGAACAXCCA 6000 

AAACCATGCC ACAGAACCCT GCCTGGGGTC tACAACACAT ATGGACTGTG AGCACCaAGT 6060 

CCAGCCCTGA ATCTGTGACC ACCTGCCAAG ATCCCCCTAA CTGGGATCCA CCAATCACTG 6120 

CACATGGCAG GCAGCGAGGC TTGGAGGTGC T7CGCCACAA GGCAGCCCCA ATTTCCTG6G 6180 

AGTTTCTTGG CACCTCGTAC TGCTCAGGAC CCTTGGGACC CTCACGAXXA CTCCCCrOA 6240 

CCAIAGTGCG CACCCTTCTG CATCCCCAGC AGGT6CCCCC CTCTTCAGAC CCTCTCTCTC 6300 

TCAGGTTTAC CCAGACCCCT 6CACCAATGA CACCATGCTG AAGCCTCA6A GAGAGAGATC 6360 

GAGCTTTGAC CAGGAGCCGC TCTTCCTTGA GGGCCAGGGC AGGGAAAGCA 6GAGGCA6CA 6420 

CCAGGAGTCG GAACACCAGT GXCTAAGCCC CTGAXGAGAA CAGGGTGGTC TCTCCCAiaT 6480 

GCCCATACCA GGCCTGTGAA CAGAATCCTC C7TCTCCAGT GACAATGTCT GAGAGGACCA 6540 

CATGTTTCCC AGCCTAACGT GCAGCCATGC CCATCTACCC ACTGCCTACT GCAGGACaCC 6600 

ACCAACCCAG GAGCTGGGAA GCTGGGAGAA GACATGGAAT ACCCATGGCT tCTCA CC TT C 6660

CTCCAGTCCA GTGGGCACCA TTTATGCCTA CGACACCCAC CTGCCGCCCC CAGG CTCHA 6720 

AGAGTTAGGT CACCTAGCTG CCTCTGGGAG CCCGAGGCAG GAGAATTGCT TGAACCCCC6 6780 

AGGCACAGGT TGCAGT6AGC CGAGATCACA CCACTGCACT CCAGCCTGGG TGACACAATG 6840

AGAcrcrsrc tcaaaaaaaa agagaaagat agcatcagtg gctaccaagg gctacgggca

GGGGAAGGTG GAGACrtAAr GATTAATAGT ArSAAGTTTC TAXGTGAGAT GATGAAaATG 
TTCTGGAAAA AAAAATA7AG rGGTGAGGAT GrAGAATATT GTGAATATAA TTAAC5GCAT 
TTAArrSTAC ACTTAACATG ArTAATGTGG CACArTTTAT CTTATGTATT TGACTACATC 
CAAGAAACAC TGGGAGAGGG AAAGCCCACC ArGIAAAATA CACCCACCCT AATCAGASAG 
TCCrCArrCT ACCCAGGTAC AGGCCCCTCA rSACCTGCAC AGGAATAACT AAGGAntAA 
GGACAtGAGG CTTCCCAGCS AACTGCAGGT 3CACAACATA AArGTATCTG CAAACAGACX 
GAGAGTAAAG CTCGGGGCAC AAACCTCAGC ACTGCCAGGA CACACACCCT TCTCG?GGAT 
TCTGACTTTA TCTGACCCCG CCCACTCTCC AGATCTTGTT GTGGGATTGG GACAACGGAG 
GTCArAAAGC CTGTCCCCAG GGCACTCTGT GTGAGCACAC GAGACCTCCC CACCCCCCCA 
CCGTTAGGrC TCCACACAtA GArCTGACCA rTAGGCATTG TCAGGAGGAC TCTAGCGCGG 
GCTCAGGGAT CACACCAGAG AATCAGGTAC AGAGAGGAAG ACCGGGCTCG AGGAGCTGAT 
GGATGACACA GAGCAGGGtr CCTGCAGTCC ACAGG7CCAG CTCACCCTGG TGTACGTGCC 
CCATCCCCCT GATCCAGGCA rCCCTGACAC AGCTCCCTCC CGGAGCCTCC TCCCAGGTGA 
CACArCAGGG TCCCTCACTC AAGCTGTCCA GAGAGGGCAG CACCTTGGAC AGCGCCCACC 

CCACrrCACT cttcctccc? cacagggctc agggctcagg gctcaagtct cagaacaaat 

GGCAGACGCC AGTGAGCCCA GAGATGGTGA CACGGCAATG ATCCAGGGGC AGCTGCCTGA 
AACGGGAGCA GGTGAAGCCA CAGATGGGAG AAGAtGGTTC AGGAAGAAAA ATCCAGGAAT 
GGGCAGGAGA GGAGAGGAGG ACACAGGCTC TCrCGGGCTC CAGCCCAGGA TCGGACTAaG 
TGTGAAGACA TCTCAGCAGG TGAGGCCAGG TCCCAXGAAC AGAGAAGCAG CTCCCACCTC 
CCCTGATGCA CGGACACACA GAGTGTGTGC TGCTGTGCCC CCACAGTCGG 6CTCTCCTGX 
TCTCGTCCCC A6GGAGTGAC AAGTGACCTT GACTTCTCCC TGCTCCTCTC TGCTACCCCA 
ACATTCACCT TCTCCTCATC CCCCTCTCTC TCAAAIATGA TTTGGATCTA TGTCCCCCCC 
CAAATCrCAT GTCAAATTGT AAACCCCaAT CTTGGAGGTG GGGCCaTGTG AGAACTCATT 
G6AXAAIGCC CGTG6ATTTT CTGCTTTGAT GCrGTTTCTG TGAIAGAGAT CTCACAIGaX 
CTGGTTGTTT AAAAGT6TGT A6CACCTCTC CCCTCTCrCT CTCTCTCTCT TACaxaiGCT 
CTGCCaiGtA AGACGTTCCr 6TTTCCCCTT CACCGTCCAG AAXCAITGTA AGTTTTCrSA 
GGCCTCCCCA CGAGCA6AAG CCACTAT6CT rCCXGTACaA CTGCAGAATC ATGAGCGAAT 
TAAACCTCTT TTCTTTAIAA ATTACCCAGT CTCAGGTATT TCCrCATAGC AATGCGACGA 
CAGACTAATA CAATCTTCtA CTCCCAGATC CCCGCACACG CTXAGCCCCA GACATCACTG 
CCCCTGGGA6 CATGCACAGC GCAGCCTCC? GCC5ACAAAA CCAAAGTCAC AAAAGGTCAC 
AAAAATCTGC ATTTGGGGAC ATCTGATTGT GAAAGAGGGA GGACAGTACA CTTGTAGCCA 
CAGAGACTGG GGCTCACCCA GCTGAAACCT CGTAGCACtT TGGCATAACA TGTGCATGAC 
CCCTGtrCAA TGTCTACAGA TCACTGTTGA GTAAAACACC CTCGTCTGGG GCCGCTCCTG

Tccccacrrc ccrccrsrcc ACCAGAGGGC GGCAGAGTTC CTCCCACCCT CGAGCCTCCC 8940

caccGGCTGc TGAccrcccr CAGCCGGGCC CACAGCCCAG CAGGGTCCAC CCTCACCC6G 9000

GTCACCTCCG CCCaCSTCC? CCTCGCCCTC CSAGCTCCTC ACACSGACTC TGTCACCTCC 9060

TCCCTGCAGC CTATCGGCCG CCCACCTGAG GCTTGTCGGC CGCCCACTTG AGGCCTSTCG 9120

GCTGCCCTCT GCAGGCAGCr ccrcrccccT acaccccctc cttccccggg CTCACcraaA 9180

AGGGCGTCTC CCAGGGCACC TCCCTGTGAr CTCCAGGACA GCTCAGTCTC TCACACGCTC 9240

CGACGCCCCC TATGCTGrCA CCTCACAGCC CrOTCATTAC CArtAACTCC TCAGTCCCAT 9300

GAAGTTCACT GAGCCCCTGT C7CCCGGTTA CAGGAAAACT CTGTCACAGG CACCICGTCT 9360

GTCCTCCTCT CTCTGGAATC CCAGGGCCCA CCCCAGTGCC TGACACGGAA CAGArSCTCC 9420

AIAAATACTG GTTAAA?GrS TGGCAGATCT CTAAAAAGAA GCArATCACC TCCGrsrSGC 9480

CCCCAGCAGT CAGAGTCTST rCCArCTGGA CACAGGGGCA CTGGCACCAG CATGGGAGGA 9540

GGCCAGCAAG TGCCCGCGOC TGCCCCAGGA ArGAGGCCTC AACCCCCAGA GCTTCAGAAG 9600

GGAGGACAGA GGCCTGCAGG GAATAGATCC 7CCGGCCTGA CCCTGCAGCC TAATCC?.CAG 9660

TTCAGGGTCA GCTCACACCA CGTCGACCCT GGTCAGCATC CCTACGGCAG TTCClGACaA 9720

GGCCCGAGGT CTCCTCTTGC CCrCCAGGGC GTCACATTGC ACACAGACAX CACTC^GGaA 9780

ACGGATTCCC CTGGACAGGA ACCTGGCTTT GCTaAGGAAG TGGAGGTGGA GCCTGGmC 9840

CATCCCTTGC TCCAACAGAC CCTTCTGATC rCTCCCACAT ACCTGCTCTG TTCCTTrCTG 9900

GGTCCTATGA GGACCCrCTT CTGCCAGGGG TCCCTGTGCA ACTCCAGACT CCCTCCWGT 9960

ACCACCATGG GGAAG6TGGG CTGATCACAG GACAGTCAGC CTCGCACAGA CAGAGACCAC 10020

CCAGGACTGT CAGGGAGAAC ATGGACAGGC CCTGAGCCGC AGCTCAGCCA ACAGACaCGG 10080

AGAG6CAGGG TCCCCCTGGA GCCTTCCCCA AGGACAGCAG AGCCCAGAGT CACCCACCTC 10140

CCTCCACCAC A6TCCTCTCT TTCCAGGACA CACAAGACAC CTCCCCCTCC ACATGCAGGA 10200

TCTCGGCACT CCTGACACCT CTGGGCCTCG GTCTCCATCC CTGGGTCACT 6GCCGGGTTG 10260

6TGGTAC7GG AGACA6AGGG CT6GTCCCTC CCCAGCCACC ACCCAGTGA6 CCTTTTTCTA 10320

GCCCCCA6AG CCACCTCTGT CACC7TCCTG TIGGGCATCA TCCCACCTTC CCAGAGCOCT 10380

GGAGAGCATG GG6AGACC0G GGACCCTGCT GGGTTTCTCT GTCACAAAGG AAAAIAAICC 10440

CCCTGGTGTG ACAGACCCAA G6ACA6AACA CAGCAGAGGT CAGCACTGG6 GAAGAQ66T 10500

TGTCCTCCCA GGGGATGGGG GTCCATCCAC CTTGCCCAAA AGATTTGTCT GAGGAACT6A 10560

AAATAGAAGG GAAAAAAGAG GAGGGACAAA AGAGGCA6AA ATGAGAGGGG AGGGGACAGA 10620

GGACACCTGA ATAAAGACCA CACCCATGAC CCACGTGATG CTGAGAAGTA CTCCTCCCCT 10680

AGGAAGAGAC 7CAGGGCAGA GGGAGGAAGG ACAGCAGACC AGACAGTCAC AGCACCCTTG 10740

ACAAAACGTT CCTGGAACTC AAGCTCTTCT CCACAGAGGA GGACACACCA GACAGCAGAG 10800

ACCATGGAGT CTCCCTCCGC CCCTCCCCAC ACATGGTGCA TCCCCTCGCA GAGGCTCCTC 10860

CTCACACGTG AAGGCAGGAC AACCTGGGAG AGGGTGGGAG GAGGGAGCTG GGGTCTCCTG 10920

CSCraCCACAG GGCTGTGAGA CGGACaGACC CCTCCTGTTG GACCCTGAAT AGGGAAGAGG 10980

ACATCACAGA GGGACAGGAG 7CACACCACA AAAATCAAAT TCAACTGGAA TTGGAi^ACCG 11040

6CAGGAAAAC CTCAAGAG?? CTATTTTCCT AGTTAATTGT CACrCGCCAC TACGTTmA 11100

AAAATCATAA TAACTGCATC AGATGACACT TTAAATAAAA ACATAACCAG GGCATGAAAC 11160

ACTGTCCTCA TCCGCCT^.CC GCGGACATTG GAAAATAAGC CCCACGCTGT GGAGGGCCCT 11220

66GAACCCTC ATGAACTCAr CCACAGGAAT CTCCAGCCTG TCCCAGGCAC TGGGGTGC^ 11280

CCAAGATC 11288

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3774 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: N-terminal



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
AACCTTTTTA GTCCTTTMA CAGTGAGCT^ 60
aCTCaCCCC CACAAGTGAA GGGTGAAGCT GGGTGGAGCC AAACCAGGCA AGCCTaCCCT 120
CAGGGCTCCC AGTGGCCTGA GAACCATTGG ACCCAGGACC CAnACTTCT AGGGXaaCGA 180
AGGTACAAAC ACCAGATCCA ACCAT6GTCT GGGCGGACAG CTGTCaAATG CCXAAAaAIA 240
TACCTG6GAG AGGAGCACGC AAACTATCAC TCCCCC&G6T TCTCIGAACA GAAACacaCG 300
GGCAACCCAA AGTCCAAATC CAGGTGAGCA GGT6CACCAA ATGCCCftGAG ATATGACGAG 360
6CAAGAA6TG AAGCAACCAC CCCTGCATCA AATCrTTTGC AT6GSAA6GA GAAGGGGGTT 420
6CTCATGTTC CCAATCCaGG ACAAT6CATT TCGCATCTCC CTTCTTCTCA CTCCTTGGTT 480
AGCAA6ACTA AGCAACCaCG ACTCTGGATT TGGCGAAAfiA CGTTTaTTTG TGGAGGCCaG 540
TGATGACAAT CCCACGAGGG CCIAGGTGAA GAGGGCAGGA AGGCTCGAGA CACTGGGGAC 600
TGAGTGAAAA CCACACCCAX 6ATCTGCACC ACCCATGGAT GCTCCTTCAr TGCXCACCTT 660
TCTGTTGATA TCAGATGGCC CCATTTTCTG XACCTTCACA GAAGGACACA GGCTAGGGTC 720
TGT6CATCGC CTTCATCCCC GGGGCCATGT GACGACAGCA GGTGGGAAAG ATCATCGGTC 780
CTCCTGGGTC CTCCAGGGCC AGAACATTCA TCACCCATAC TGACCTCCTA GATGCGAaiG 840
GCTTCCCTGG GGCTGGGCCA ACGGGGCCTG CGCAGGGGAG AAAGGACGTC AGGGGACACG 900
GAGGAAGGGT CATCGAGACC CACCCTGGAA CCTTCTTGTC TCTGACC31TC CAGGATTtAC 960

TTCCCTGCAT CTACCTTTCG TCATTTTCCC TCAGCAATCA CCAGCTCXCC TTCCTCAICT 1020

cacccTccca ccctcgacac agcaccccag rcccrooccc gcctgcatcc AcccsAsacc 1080

CTGaraACCC AGCACCCar? ACTTCTASGG rAAGGAOCCT CCAG5AGACA CAAGCTGAGG 1140 

AAAGCTCrCA AGAAGTCACA TCTGTCCTGC CCACAGGGOA AAAACCATCA GATSCtSAAC 1200 

CA66A6AAT6 TT6ACCCACG AAAGGGACCG AG5ACCC3UV6 AAAGGAGTCA GACCACCAOC 1260 

GTTTGCCTGA GAGGAAGGAr CAAGGCCCCO AGGGAAAGCA GGGCrGCCTG CATGTGCASG 1320 

ACACTGGTGG GGCATArGTG TCTTAGATTC 7CCCTGAATT eAGTGTCCCT GCCATGGCCA 1380 

GACTCrCTAC TCAGGCCTGG ACATGCTGAA ATAGGACAAT GGCCTTGTCC TCTCTCCCCA 1440 

CCATTTGGCA AGACACArXA AGGACATTCC AGGACATGCC TTCCTSGGAG GTCCAGGSTC 1500 

TCTGTCTCaC ACCTCAGGGA CTGTAGTTAC TSCArCAGCC ATGCTAGCTG CTCArCTCAC 1560 

CCAGCCTGrc CAGGCCCTTC CACTCTCCAC rTTSTGACCA TGTCCAGGAC CACCCCTCAG 1620

ATCCTCAGCC TGCAAATACC CCCTTCCTGG CrGGGTGGAT TCAGTAAACA GTCAGCTCC7 1680 

ATCCAGCCCC CAGAGCCACC rCTGTCACCT rCCTGCTGGG CATCArCCCA CCTTCACAAG 1740 

CACTAAAGAG CATGGGGAGA CCTGGCTAGC 7GGG?rrCTG CATCACAAAG AAAATAATCC 1800 

CCCAGGTTCG GATTCCCAGG GCTCTGTATG roCAGCTGAC AGACCTGAGG CCAGGAGATA 1860 

GCAGAGGTCA GCCCTAGSGA GGGTGGGTCA "CCACCCaGG GGACAGGGGT GCACCA6CCT 1920 

TGCTACTGAA AGGCCCTCCC CAGCACaGCG CCAtCAGCCC TGCCTGAGAG CTTTCCTaAA 1980 

CA6CAGXCAG AGGAGGCCA? GCCAGTGGCT CAGCTCCTGC TCCAGGCCCC AACACACCAG 2040 

ACCAACaCCA CAATGCaCTC CTTCCCCAAC GTCACACCXC ACCSAAGGGA AACTGAGGTG 2100 

CiacCIAACC TTAGAGCCAT CAGGGGA6AT AACAGCCCAA TTTCCCAAAC AGGCCAGTTT 2160 

CAATCCCATG ACAAlGACCr CTCTGCXCTC ASTCTtCCCA AAASAGGACG CTGAITCTCC 2220 

CCCACCAIGG ATTTCTCCCT TGTCCCGGGA C CC ITTTCTG CCCCCrATGA TCTCGGCACT 2280 

CCIGAC&CaC ACCTCCTCTC TGGTGACAaa TCAGGGTCCC TCACrGTCAA GCAGTCCASA 2340 

AAGGACAfiAA CCTTGGACaG CCCCCAICTC ACCTTCaCCC TTCCTCCTTC ACAGGGTTCA 2400 

GGGCAAACaa TAAATGCCaC AGGCCAGTGA CCCC3USACAT GGTCACAGGC ACTGACCCaO 2460 

GGCCAGATGC CTGCAGCACG A6CIGCCGCG GCCACAGGGA GAAGGTGATG C3«WAAGGSA 2520 

AACCC a SA A A TGGGCAGGAA AGGA6GACAC AGGCTCTCTG GGGCTGCAGC CC3USGGTTG6 2580 

ACT&TGACTG TGAAGCCaiC TCAGCAAGTA AGGCCAGGTC CCATGAACaA GAGTGG6A0C 2640 

ACGTGGCITC CTGCTCTGTA TATGGGGTGG GGGATTCCAT GCCCCATAGA ACCAGATGCC 2700 

C6CG6TTCA6 ATGGAGAACG AGCAGGACAG GGGATCCCCA GGA7AGGAGG ACCCCAGTGT 2760 

CCCCACCCA6 GCAGGTGACT GATGAATGCG CATGCAGGGT CCTCCTGGGC TCGCCTCTCC 2820 

CTTTGTCCCT CAGGATTCCT TGAAGGAACA TCCGGAAGCC GACCACATCT ACCTGCTGGG 2880 

TTCTGGCGAG TCCaVTGTAAA GCCAGCAGCT TCTGTTGCTA CGAGGGCTCA TGCCAIGTCC 2940

TGGGGGCACC AAAGAGAGAA ACCTCAGGGC AGCCAGGACC TGGTCTGAGG AGGCATCGG& 3000 

GCCCAGATGG GGAGATGGAT GTCAGGAAAG GCTGCCCCAT CaGGGACGGT GATAGCAATG 3060

GGGGGrCTGT GGGACrSGGC ACSTGGGATT CCCTCGCCTC tCCCaACTTC CCTCCCmC 3120

TCACaACCTC CGGACACrSC CCATGAAGGG GCGCCTTTGC CCAGCCAGAT GCTGCTGGTT 3180 

CTGCCCATCC ACTACCCTCT CTGCTCCAGC CACTCTGGGT CTTTCTCCAG ATGCCCTCCA 3240 

CAGCCCTCGC CTGGGCCTGr CCCCTGAGAC G7GTTGCGAG AAGCTGAGTC TCTGGGCJU31 3300 

CTCTCATttG AGTCTC-AAAG GCACATCAGG AAACATCCCT GGTCTCCAGG ACXaGGCaAT 3360 

GAGGAAACGG CCCCAGCTCC TCCCTTTGCC ACTGAGACGG TCGACCCTG6 GTGGCC&CaG 3420 

TGACrrCTGC GTCTGtCCCA GTCACCCTGA AACCACAACA AAACCCCACC CCCaGACCCT 3480 

GCAGGTACAA TACATGTSGG GACAGXCTG7 ACCCAGGGGA AGCCaGTTCT CTCTTCCTAG 3540

GAGACCGGGC CTCAGGGCTG TGCCCCGGCC ACGCGGCGGC AGCACGTGCC TGTCCncaG 3600 

AACTCGGGAC CTTAAGGGtC TCTGCTCrGT GAGGCACAGC AAGGMCCTT CTGTCCaGaG 3660 

AXGAAMCAG CTCCTGCCCC rCCTCTGACC TCTTCCTCCT TCCCaaATCT CAACCaACAA 3720 

AIMGrcrrT CAAATCTC?.? CATCAAATCT TCATCCATCC ACATGAGAAA GCTT 3774 
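
The sequence rows in this listing follow a fixed layout: groups of ten bases, six groups to a row, and a trailing number giving the cumulative base count at the end of that row. Because the layout is this regular, scanning damage can be located mechanically. The following is a minimal Python sketch of such a check, not part of the original specification; the function name `check_rows` and the sample rows are hypothetical and were made up for illustration.

```python
VALID_BASES = set("ACGT")


def check_rows(rows, start=0):
    """Check listing rows of the form '<groups of bases> <cumulative count>'.

    Returns (total_bases, problems), where problems is a list of strings.
    This is an illustrative helper, not a tool named in the document.
    """
    total = start
    problems = []
    for lineno, row in enumerate(rows, 1):
        parts = row.split()
        if not parts:
            continue  # blank separator line between rows
        # The cumulative count is the trailing all-digit token, if present.
        stated = int(parts.pop()) if parts[-1].isdigit() else None
        bases = "".join(parts)
        bad = sorted(set(bases.upper()) - VALID_BASES)
        if bad:
            problems.append(f"row {lineno}: non-base characters {bad}")
        total += len(bases)
        if stated is not None and stated != total:
            problems.append(f"row {lineno}: stated {stated}, counted {total}")
            total = stated  # resynchronise so one damaged row is reported once
    return total, problems


if __name__ == "__main__":
    # Made-up sample rows; the second data row carries a digit and a dropped base.
    sample = ["ACGTACGTAC GTACGTACGT 20", "", "ACGTAC6TAC GTAC 35"]
    total, problems = check_rows(sample)
    print(total)
    for p in problems:
        print(p)
```

A row that reports either a non-base character or a count mismatch is a candidate for comparison against the corresponding row of Fig. 6, which reproduces the same region with CEA coordinates.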
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
CCCTGTGATC TCCAGGACAG CTCAGTCTCC GTCCAATCTC 40

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
GTTTCCTGAG TGATGTCTGT GTGCAATG 28

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CCTGGAACTC AACCTTGAAT TCTCCACAGA GGAGG 35

CLAIMS:

1. A DNA molecule comprising the carcinoembryonic 
antigen (CEA) transcriptional regulatory sequence (TRS) 
but without associated CEA coding sequence. 

2. A molecular chimaera comprising a CEA TRS and a DNA 
sequence operatively linked thereto encoding a 
heterologous enzyme. 

3. A molecular chimaera according to claim 2 wherein the
heterologous enzyme is capable of catalysing the
production of an agent cytotoxic or cytostatic to CEA+
cells.

4. A molecular chimaera according to claim 3 wherein the
heterologous enzyme is cytosine deaminase (CD).

5. A molecular chimaera according to any of claims 2 to 
4 wherein the CEA TRS and the sequence encoding a 
heterologous enzyme are in an expression cassette. 

6. A molecular chimaera according to claim 5 which 
comprises DNA sequence of the coding sequence of the gene 
coding for the heterologous enzyme and additionally 
includes an appropriate polyadenylation sequence which is 
linked downstream in a 3' position and in proper 
orientation to the CEA TRS. 

7. A retroviral shuttle vector comprising a molecular 
chimaera according to any of claims 2 to 6. 

8. A retroviral shuttle vector according to claim 7
comprising a DNA sequence comprising a 5' viral LTR
sequence, a cis acting psi encapsidation sequence, the
molecular chimaera and a 3' viral LTR sequence.

9. A retroviral shuttle vector according to claim 8
based on Moloney murine leukaemia virus.

10. A retroviral shuttle vector according to any of
claims 7 to 9 which is a SIN vector.

11. An infective virion comprising a retroviral shuttle 
vector according to any of claims 7 to 10, the vector 
being encapsidated within viral proteins to create an 
artificial, infective, replication defective, retrovirus. 

12. A packaging cell line comprising a retroviral 
shuttle vector according to any of claims 7 to 10. 

13. A pharmaceutical composition comprising an infective
virion according to claim 11 or packaging cell line
according to claim 12 together with a pharmaceutically
acceptable carrier.

14. Use of CEA TRS for targeting expression of a
heterologous enzyme to CEA+ cells.

15. Use according to claim 14 wherein the heterologous
enzyme is capable of catalysing the production of an
agent cytotoxic or cytostatic to CEA+ cells.

16. Use according to claim 15 wherein the heterologous 
enzyme is CD. 

17. A DNA molecule according to claim 1 which comprises
one or more of the following sequence regions of the CEA
gene in either orientation:

about -299b to about +69b, more preferably about -90b to
about +69b;

-14.4kb to -10.6kb, preferably -13.6kb to -10.6kb;

-6.1kb to -3.8kb.



18. A molecular chimaera according to any of claims 2 to
6, retroviral shuttle vector according to any of claims 7 to
10, packaging cell line according to claim 12 or
composition according to claim 13 wherein the CEA TRS
comprises one or more of the following sequence regions
of the CEA gene in either orientation:

about -299b to about +69b, more preferably about -90b to
about +69b;

-14.4kb to -10.6kb, preferably -13.6kb to -10.6kb;

-6.1kb to -3.8kb.

19. Use according to any of claims 14 to 16 wherein the
CEA TRS comprises one or more of the following sequence
regions of the CEA gene in either orientation:

about -299b to about +69b, more preferably about -90b to
about +69b;

-14.4kb to -10.6kb, preferably -13.6kb to -10.6kb;

-6.1kb to -3.8kb.

Plasmid     CEA Coordinates

pCRna       (-299 to +69)
pCRlOB      (-1664 to +69)
pCRUB       (-14462 to -10691) + (-299 to +69)
pun If 0    I -03 10 -itU | + (-90 to +69)
pCRIBB      [3X(-89 to -40)] + (-90 to +69)
pCR136      (-3919 to -6071) + (-299 to +69)
pCR137      (-6071 to -3919) + (-299 to +69)
pCR162      (-13579 to -10691) + (-89 to -40) + (-90 to +69)
pCR163      (-10691 to -13579) + (-89 to -40) + (-90 to +69)

Fig. 3B

-14463 AAGCTTTTTA GTOCTTTAJSA CAGTQAGCTG GTCTCTCTAA CCCAAGTGAC CTGGGCTC

-14403 TACTCAGCCC CAGAAGTGRA GGGTGAAGCT GGGTCGaCCC AAACCAGGCA AGCCTACC 

-14343 CAGOGCTCCC AGTGGCCTCSA C5AACCATTC3G ACCCAGGACC CATTACTTCT AGGGTAAG 

-14283 AGGTACAAAC ACCAGATCCA ACCATGGTCT GGGGGGACAG CTCTCAAATG 

-14223 TACCTGGGAG AGQAGCAGGC AAACTATCAC TGCCCCAGGT TCTCTGAACA GAAACAGA 

-14163 GGCAACCCAA AGTCCAAATC -CAGGTGAGCA GGTGCACCAA ATGCCCAGAG ATATGACG 

-14103 GCAAQAAGTG AAGGAACCAC CCCTGCATCA AATGTTTTGC ATCGGAAGGA GAAGGGGG 

-14043 GCTCATGTTC CCAATCCAGG AGAATGCATT TGGGATCTGC CTTCTTCTCA CTCCTTGG 

-13983 AGCAAGACTA AGCAACCAGG ACTCTGGATT TGGGGAAAGA CGTTTATTTG TGGAGGCC 

-13923 TGATGACAAT CCCACGAGGG CCTAGGTQAA GAGGGCAGGA AGGCTCGAGA CACTCGGG 

-13863 TGAGTGAAAA CCACACCCAT GATCTGCACC ACCCATGGAT GCTCCTTCAT T6CTCACC 

-13803 TCTGTTGATA TCAGATGGCX: CCATTTTCTG TACCTTCACA GAAGGAC31CA GGCTAGGG 

-13743 TGTGCATGGC CTTCATCCCX: GGGGCCATGT GAGGACAGCA GGTGGGAAAG ATCATGGG 

-13683 CTCCTGGGTC CTGCAGGGCC AGAACATTCA TCACCCATAC TGACCTCCTA GATGGGAA

-13623 GCTTCCCTGG GGCTGGGCCA ACGGGGCCTG GGCAGGGGAG AAAG6ACGTC AGGGGACA

-13563 GAGGAAGGGT CSITOGAGACC CAGCCTGGAA GGTTCTTGTC TCTCACCATC CSMSGATTT

-13503 TTCCCTGCAT CTACCTTTGG TCATTTTCCC TCAGCAATGA CCAGCTCTGC TTCCTGAT

-13443 CAGCCTCCCA CCCTGGACAC AC3CACCCCAG TCCCTGGCCC GGCTGCATCC ACCCftATA

-13383 CTGATAACCC AGGACCCAOT ACITCTAGGG TAAGGAGGGT CCAGGAGAO^ GAAGC^

-13323 AAAGGTCTGA AGAAGTCACA TCTGTCCTGG CCAGAGGGGA AAAACCATCA GATGCTGA

-13263 CAGGAGAATG TTGACCCAGG AAAGGGACCG AGGACCCAAG AAAGGAGTCA GACCACCA

-13203 GTTTGCCTGA QAGGAAGGAT CAAGGCCCCG AGGGftAAGCA GGGCTGOCTG CATGTGCA

Fig. 6 (1/11)

-13143 ACACTGGTGG GGCATATGTG TCTTAGATTC TCCCTGAATT CAGTGTCCCT GCCATGGC

-13083 GACTCTCTAC TCAG6CCTGG ACATGCTGAA ATAGGACAAT GGCCTTGTCC TCTCTCCC

-13023 CCATTTGGCa AGAGACATAA AGGACATTCC AGGACATGCC TTCCTCGGAG GTCCAGGT

-12963 TCTGTCTCAC ACCTCAGGGA CTGTAGTTAC TGCATCAGCC ATGGTAGGTG CTGATCTC

-12903 CCA6CCTGTC CAGGCCCTTC CACTCTCCAC TTTGTGACCA TGTCCAGGAC CACCCCTC

-12843 ATCCTGAGCC TGCAAATACC CCCTTGCTGG 6TGGGTGGAT TCAGTAAACA GTGAGCTC

-12783 ATCCAGCCCC CAGAGCCACC TCTGTCACCT TCCTGCTGGG CATCATCCCA CCTTCACA

-12723 CACTAAAGAG CATGGGGAGA CCTGGCTAGC TGGGTTTCT6 CATCACAAAG AAAATAAT

-12663 CCCAGQTTCG GATTCCCAGG GCTCT6TATG TGGAQCTQAC AGACCTGAGG CCAGGAGA

-12603 GCaGAGGTCA GCCCTAGGGA GGGTGGGTCA TCCACCCftGG GGACAGGGGT GCACCAGC

-12543 TGCTACTGAA AGGGCCTCCC CAGGACAGCG CCATCAGCCC TGCCTGAGAG CTTTGCTA

-12483 CAGCAGTCAG AGGAGGCCAT GGCAGTGGCT GAGCTCCTGC TCCAGGCCCC AACAGACC

-12423 ACCAACAGCai CftATGCAGTC CTTCCCCAAC GTCACAGGTC ACCAAAGG6A AACTQAGG

-12363 CTACCTAACC TTAGAGCCAT CAGGGGAGAT AACAGCCCAA TTTCCCAAAC AGGCCAGT

-12303 CAATCCCATG ACAATGACCT CTCTGCTCTC ATTCTTCCCa AAATAGGACG CTGATTCT

-12243 CCCACCATGG ATTTCTCCCT TGTCCCQGGA GCCTTtTCTG CCCCCTATGA TCTGGGCA

-12183 CCTGACACAC ACCTCCTCTC TGGTGACATA TCAGGGTCCC TCACTOTCAA GCAGTCCA

-12123 AAGGACaGAA CCTTGGACAG CGCCCATCTC AGCTTCACCC TTCCTCCTTC ACAGGGTT

-12063 GGGCAAAGAA TAAATGGCAG AGGCCAGTGA GCCCA6AGAT GGTCACAGGC AGTGACCC

-12003 GGGCAGATGC CTGGAGCAGG AGCTGGCX3GG GCCACaGGGA GAAGGTGATG C^GGAftGG

-11943 AACCC3MSAAA TGGGCAGGAA AGGAGGACAC AGGCTCTGTG GGGCTGCAGC CCAGGGTT

-11883 ACTATGAGTG TGAAGCCATC TCAGCAAGTA AGGCCAGGTC CCATGftACAA GAGTGGGA

-11823 ACGTGGCTTC CTGCTCTGTA TATGGGGTGG GGGATTCCAT GCCCCATAGA ACCAGATG

-11763 CGGGGTTCAG ATGGAGAAGG AGCAGGACAG G^^^
-11703 CCCO^CCOUS GCAGGTCACT GAT^TGGG 

-11643 CTTTCTCCCT CAGGATTCCT TOAAGGAAOV TCCGGAAGCC GACCACATCT ACCTGGTO 
-11583 ^CTGGGGAG TCCAT^TAAA OrCAGGAGCT TGT^^^ 
-11523 ^CACC AAAGAGAGAA ACCTGAGGGC AGGCAGGACC ^ 
-11463 GCCX3.GATGG GGAGATGGAT GTCAGG^v^AG GCTGCCCCAT CAGGGAGGGT GATAGCAA 
-11403 GGGGGTCTGT GGGAGT^C ACGTCGO^^rr CCCT.3GGCTC OGCCAAGTTC CCTCCCAT 
-11343 TCACAACCTG GGGACACTGC CCATGAAGGG GCGCCTTTGC CCAGCCAGAT GCTOCI^ 
-11283 CTGCCCATCC ACTACCCTCT CTOCTCCAGC avCTCTGGGT CTTTCTCC;VG ATGCCCTG 
-11223 CAGCCCTCGC CTGGGCCTGT CCCCTO^^GAG GTGTTGGGAG AAGCTG^^GTC TCTGGGGA 
-11163 CTCTCTCAG AGTCTGAAAG GCACATC.GG AAACATCCCT GGTCTCOVGG ACTAGGCA 
-11103 ^S'^^SAAAGGG CrCCAOCTC^ 

-11043 TGACTTCTGC GTCTGTCCO. GTCACCCTG;. AACCACAAC;. AAACCCCAGC CCCAGACC 
-10983 «=««^CAA TACATGTGGG GACAGTCTGT ACCCAGGGG^ • 
-10523 GAGACCGGGC CTCAGGGCTG TGCCCGGGOC AGGCGGGGGC A«3CACGT0CC TGTCCTTG 
-10863 AACTCGGGAC CTTAAGGGTC TCTGCTCTGT GAGGCACAGC AAGGATCCIT C^CCAG 
-10803 ^'^'^AAGCAG CTCCTGCCCC TCCTCTG.^^ 

-10743 ATAGGTGTTT CAAATCTCAT CATCAAATCT TCATCCATCC ACATGAOAAA OCTTAAAA 

-10683 ACAACATCAA GAGTTGGAAC AAGO^CAT GGAGA«^^ 

-10623 TTTAGATGTO TTCAGCTATC GGGCAGGAGA ATCTGTGO^ AATTCCAGCA T^GTTCAG 

-10563 <^^CAAAAA CTGTCACAGT CCAAATG^C AACAGTGO^ 

-10503 ^^^CTGAG GGATAmTG GAACATGAGA AAGGAAGGGA T.^^^ 

-10443 ^^TCTCA CACATAGAGT TGAAAGAMO 

Fig. 6 (3/11)

-10383 AATTCCACCT CTATAAAGTT TCCAAGAGGA AAACCCAATT CTGCTGCTAG AGATCAGA
-10323 GGAGGTGACC TGTGCCTTGC AATGGCTGTG AGGGTCACX^G GAGTGTCACT TAGTGCAG
-10263 AATGTGCCGT ATCTTAATCT GGGCAGG6CT TTCATGAGCA CATAGGAATG CAGACATT
-10203 TGCTGTGTTC ATTTTACTTC ACCGGAAAAG AAGAATAAAA TCAGCCGGGC GCGGTGGC
-10143 ACGCCTGTAA TCCCAGCACT TTAGAAGGCT GAGGTGGGCA GATTACTTGA GGTCAGGA
-10083 TCAAGACCAC CCTGGCCAAT ATGGTGAAAC CCCGGCTCTA CTAAAAATAC AAAAATTA
-10023 TGGGCATGGT GGTGCGCGCC TGTAATCCCA GCTACTCGGG AGGCTGAGGC TGGACAAT
-9963 CTTGGACCCA GGAAGCAGAG GTTGCAGTGA GCCAAGATTG TGCCACTGCA CTCCAGCT
-9903 GGCA2VCAGAG CCAGACTCTG TAAAAAAAAA AAAAAAAAAA AAAAAAAGAA AGAAAGAA
-9843 AGAAAAGAAA GTATAAAATC TCTTTGGGTT AACAAAAAAA GATCCACAAA ACAAACAC
-9783 GCTCTTATCA AACTTACACA ACTCTGCCAG AGAACAGGAA ACACAAATAC TCATTAAC
-9723 ACTTTTGTGG CAATAAAACC TTCATGTCAA AAGGAGACCA GGACACAATG AGGAAGTA
-9663 ACTGCAGGCC CTACTTGGGT GCAGAGAGGG AAAATCCACA AATAAAACAT TACCAGAA
-9603 AGCTAAGATT TACTGCATTG AGTTCATTCC CCAGGTATGC AAGGTGATTT TAACACCT
-9543 AAATCAATCA TTGCCTTTAC TACATAGACA GATTAGCTAG AAAAAAATTA CAACTAGC
-9483 AACAGAAGCA ATTTGGCCTT CCTAAAATTC CACATCATAT CATCATGATG GAGACAGT
-9423 AGACGCCAAT GACAATAAAA AGAGGGACCT CCGTCACCCG GTAAACATGT CCACACAG
-9363 CCAGCAAGCA CCCGTCTTCC CAGTGAATCA CTGTAACCTC CCCTTTAATC AGCCCCAG
-9303 AAGGCTGCCT GCGATGGCCA CACAGGCTCC AACCCGTGGG CCTCAACCTC CCGCAGAG
-9243 TCTCCTTTGG CCACCCCATG GGGAGASCAT GAGGACAGGG CAGAGCCCTC TGATGCCC
-9183 ACATGGCAGG AGCTGACGCC AGAGCCATGG GGGCTGGAGA GCAQAGCTGC TGGGGTCA
-9123 GCTTCCTGAG GACACCCAGG CCTAAGGGAA GGCAGCTCCC TGGATGGGGG CAACCAGG
-9063 CCGGGCTCCA ACCTCAGAGC CCGCATGGGA GGAGCCAGCA CTCTAGGCCT TTCCTAGG
-9003 GACTCTGAGG GCaCCCTGAC ACSACRG^

-8943 CACGGGACCC TGCTCTCGTC GCWaTCAGG AGAGAGTCGG ACACCATGCC AGGCCCCC 
-8883 GGCATGGCTG CGACTGACCC AGGCCACTCC CCTCCATCCA TCftGCCTCGG TAAGTCAC 
-8823 GACCAAGCCC AGGACCAATC TGGAAGGAAG GJ^^ 
-8763 AAGGTCAGTG CAlU«SU3aGG CCATGAGO^ 

-8703 ACCATCATCT ATCATAAGTA GftMCCCTGC TCCATGACCC CTGCATTTAA ATAAACGT 
-8643 GTTAAATOAG TCAAATTCCC TCACCATGAG AGCTCACCTG TOTGTAGGCC CATCaCAC 
-8583 ACAAACACAC AaiCAa«avC ACACACACAC ACACACaCAC ACAGGGA^ 
-8523 TGGACAGCAC CAGGCAGGCT TCACAGGCAG AGCAAACAGC GTCAATOACC CATGCAGT 
-8463 CCTGGGCCCC ATOU3CTa«5 AGACCCTGTG AGGGCTGAGA 

-8403 ACTTAGAGAG GGTGGGGCCT CCAGGGAGGG GGCTGCAGG6 AGCTGGGTAC TGCCCTCC 

-8343 GGftfiGGGGCT GCAGOGAGCT GGGTACTXKX: CtCCA^ 

-8283 ACTGCCCTCC AGGGAGGGGG CTOCAGGGAG CT^^ 

-8223 AGGGAGCTGG GTACTGCrCT CCAGGGAGGC AGGAGCACTG TTCCC^ 

-8163 TTCCTGCAGC AGCTOCACAG ACACAGGAGC CCCCATGACT GCCCTGGGCC AGGGTOTC 

-8103 "CCAAATTT CGTGCCCCAT TGGGTGGG&C GGAGGTTGAC 

-8043 TCTGATTCCA AACTTAAACT ACTOTGCCta CAAAATAGGA AATAACCCTA CTTTTTCT
-7983 TATCTCAAAT TCCCTAAGCA CAAGCTAGCA CCCTTTAAAT CAGGAAGTTC AGTCACTC 
-7923 GGGGTCCTCC CATGCCCCCA GTCTGACTTC CAGOTGCACA GGGTGGCTGA CATCTGTC 
-7863 TGCTCCTCCT CTTO3CTCAA CTGOCGCCCC TCCTG^ 
-7803 GATOrrAGAG CTGGCCCa^T GATTGACAGG AAGGCAGGA 

-7743 TAGGGGTGTC AAGAGAGCTG GGCATCCCAC AGAGCTGCAC AAGATGACGC GGACAGAG 
-7683 ^CAC3«3GG CTCAGGGCTT CAGACGGGTC GGGAGGCTCA GCTG^ 

Fig. 6 (5/11)

-7623 CCTGAGGAGC CTCAGTGGGA AAAGAAGCAC TGAAGTGGGA AGTTCTGGAA TGTTCTGG

-7563 AAGCCTGAGT GCTCTAAGGA AATGCTCCCA CCCCGATGTA GCCTGCAGCA CTGGAOGG 

-7503 TOTOTACCTC CCCGCTGCCC ATCCTCTC31C AGCCCCCGCC TCTAGGSACA CAACTCCT 

-7443 CCTAACATGC ATCTTTCCTQ TCTCATTCCA CACAAAAGGG CCTCTGGGGT CCCTGTTC 

-7383 CATTGCAAGG AGTGGAGGTC ACGTTCCCAC AGACCACCOl GCAACAGGGT CCTATGGA 

-7323 TCCGGTCAGG AGGATCACAC GTCCCCCCAT GCCCAGGGGA CTGACTCTGG GGGTGATG 

-7263 TTGGCCTGGA GGCCACTGGT CCCCTCTGTC CCTGAGGGGA ATCTGCACCC TGGAGGCT 

-7203 CACATCCCTC CTGATTCTTT CAGCT6AGG6 CCCTTCTTGA AATCCCAGGG AGGACTCA 

-7143 CCCCACTGGG AAAGGCCCA6 TGTGGACGGT TCCACAGCaG CCCAGCTAAG GCCCTTGG 

-7083 ACaGATCCTO AGTGAGAQAA CCTTTAGGGA CACAGGTGCA OQGCCATGTC CCCAGTGC 

-7023 ACACAGAGCA 6GGGCATCTG GACCCTGAGT GTGTAGCTCC CGCGACTGAA CCCAGCCC 

-6963 CCCCAATQAC GTGACCCCTG GGGTGGCTCC AGGTCTCCAG TCCATGCXAC OVAAATCT 

-6903 AGATTGAGGG TCCTCCCTT6 AGTCCCTQAT GCCTGTCCAG GAGCTGCCCC CTGAGCAA 

-6843 CTAGAGTGCA GAGGGCTGGG ATTGTGGCAG TAAAAGCAGC CACATTT6TC TCAGGAAG 

-6783 AAGGGAGGAC ATGAGCTCCA GGAAGGGCGA TGGCGTCCTC TAGTGGGC6C CTCCTGTT 

-6723 TGAQCAAAAA GGOGCCaGGA GASTTGAGAG ATCAGGGCTO GCCTTGGACT AAGGCTCA 

-6663 TGGA6AGGAC TQAGGTGCAA A6AGGGGGCT GAAGTAGGGG AGTGGTCGGG AGAGATGG 
-6603 GGAGCAGGTA AQGGGAAGCC CCAGGGAGGC CGGGGGAGGG TACAGCAGAG CTCTCCAC

-6543 CTCAGCATTG ACATTTGGGG TGGTCGTGCT AGTGGGGTTC TGTAAGTTGT AGG6TGTT 

-6483 GCACCATCTG GGQACTCTAC CCACTAAATG CCAGCAGGAC TCCCTCCCCA AGCTCTAA 

-6423 ACCaACAATG TCTCCAGACT TTCCAAATGT CCCCTGGAGA GCAAAATTGC TTCTGGCA 

-6363 ATCACTGATC TACGTCAGTC TCTAAAAGTG ACTCATCAGC GAAATCCTTC ACCTCTTG 

-6303 AGAAQAATCSl CAA6TGTGAG AGGGGTAGAA ACTGCAGACT TCAAAATCTT TCCAAAAG

-6243 TTTTACTTAA TCAGCAGTTT GATGTCCCAG GAGAAGATAC ATTTAGAGTG TTTAGAGT

-6183 ATGCCRCATG GCTGCCTGTA CCTCACaGCA GGAGCAGA6T OGGTTTTCCA AGGGCCTQ 

-6123 ACCACAACTG OaATOACACT CACT0G6TTA CATTACAAAG TGGAATGTCG GGAATTCT 

-6063 AGACTTTGGG AAGGGAAATG TATGACGTGA- GCCCACAGCC TAAGGCAGTG GACAGTCC 

-6003 TTTGAGGCTC TCACCATCTA GGASACATCT CAGCCATGAA CATAGCCACA TCTGTCAT 

-5943 GAAAACATGT TTTATTAAGA OGSAAAAATCT AGOCTAORAS TGCTTTATGC TCTTTTTT 

-5883 CTTTATGTTC AAATTCATAT ACTTTTAGAT CATTCCTTAA AGAAGAATCT ATCCCCCT 

-5823 GTAAATGTTA TCACTGACTG GATA6TGTTG GTGTCTCACT CCCAACCCCT GTGTGGTC 

-5763 AGTGCCCTGC TTCCCCAGCC CTGGGCCCTC TCT6ATTCCT GAGAGCTTTG GGTGCTCC 

-5703 CATTAGGAGG AAGAGROGAA GGGTOmTT AATATTCTCA CCATTCACCC ATCCACCT 

-5643 TA6ACACTG6 GAAGAATCAG TTGCCCACTC TTCGATTK3A TCCTCQAATT AATQACCT 

-5583 ATTTCTGTCC CTTGTCCATT TCAACAATGT GACAGGCCTA AGAGGTGCCT TCTCCaiG 

-5523 ATTTTTGAGG AGAAGGTTCT CSkAGATAAST TTTCTCACAC CTCTTTGAAT TACCTCCA 

-5463 TGTGTCCCCa TCACCATTAC CAGCAGCATT TGGACCCTrr TTCTGTTAGT CAGATGCT 

-5403 CCACCTCTTG AGGGTGTATA CT6TATGCTC TCTACACaGG AATATGCAGA GGAAATAG 

-5343 AAAGG6AAAT CGCATTACTA TTCftGAGAGA AGAAGACCTT TATGTGAATG AATGAGAG 

-5283 TAAAATCCTA AGAGAGCCCA TATAAAATTA TTACCAGTGC TAAAACTACA AAAGTTAC 

-5223 TAACAGTAAA CTAGRATAAT AAAACATGCA TCACAGTTGC TGGTAAAGCT AAATCAGA 

-5163 TTTTTTTCTT AGAAAAAGCA TTCCATGTGT GTTGCaGTGA TGAOMSGAGT GCCCTTCA 

-5103 CAATATGCTG CCTGTAATTT TTGTTCCCTG GCAGAATGTA TTGTCTTTTC TCCCTTTA 

-5043 TCTTAAATGC AAAACTAAAG GCAGCTCCTG GGCCCCCTCC CC3AAGTCAG CTGCCTGC 

-4983 CCAGCCCCAC GAAGAGCAGA GGCCTGAGCT TCCCTGGTCA AAATAGGGGG CTAGGGAG 

-4923 TAACCTTGCT CGATAAAGCT GTGTTCCCAG AATGTOGCTC CTGTTCCCaG GGGCACCA

-4863 CTGGAGGGTG GTGAGCCTCA CTGGTGGCCT GATGCTTACC TTGTGCCCTC ACACCAGT



-4803 TCACTGGAAC CTTCSAACACT TGGCTGTOGC CCGGATCTCC AGATGTCAAG AACTTCTC 

-4743 AOTCAAATTA CTGCCCACTT CTCCAGGGCA GATACCTOTG AACATCCAAA ACCATGCC 

-4683 AGAACCCTGC CTOGGGTCTA CAACACATAT GGACTGTGAG CACCAAGTCC AGCCCTGA 

-4623 CTGTGACCAC CTGCCAAGAT GCCCCTAACT GGGATCCACC AATCACTGCA CATGGCAG 

-4563 AGCGAGGCTT GGAGGTGCTT CGCCACAAGG CAGCCCCAAT TTGCTGGGAG TTTCTTGG 

-4503 CCTG6TAGT6 GTGAGGAGCC TTGGGACCCT CAGGATTACT CCCCTTAAGC ATAGTGGG 

-4443 CCCTTCTGCA TCCCCAGCAG GTGCCCCGCT CTTCAGAGCC TCTCTCTCTG AGGTTTAC 

-4383 AGACCCCTGC ACCAATGAGA CCATGCTGAA GCCTC3W3AGA GA6AGATGGA GCTTTGAC 

-4323 GGAGCOOCTC TTCCTTGAGG GCCAGOGCAO GQAAAGCAOG AGGCAGCACC AGGAGTGG 

-4263 ACaCCAGTGT CTAAGCCCCT GATGAGAACA GGGTGGTCTC TCCCATATGC CCATACCA 

-4203 CCT6TGAACA GAATCCTCCT TCTGCAGTGA CAATGTCTGA GAGGACGACA TGTTTCCC 

-4143 CCTAACGTGC AGCCATGCCC ATCTACCCAC TGCCTACTGC AGGACAGCAC CAACCC3«3 

-4083 GCTGGGAAGC TGGGAGAAGA CATGGAATAC CCATG6CTTC TCACCTTCCT CCAGTCC3V 

-4023 GGGCACCATT TATGCCTAGG ACACCCACCT GCCGGCCCCA GGCTCTTAAG AGTTAGGT 

-3963 CCTAGGTGCC TCTGGGAG6C CGAGGCAGGA GAATTGCTT6 AACCCGGGAG 6CAGAGGT 

-3903 CftGTGAGCCG AGATCACACC ACT6CACTCC AGCCTGGGTG ACAGAATGAG ACTCTGTC 

-3843 AAAAAAAAAG AGAAAGATAG CATCAGTGGC TACCAAGGGC TAGGGGCAGG GGAAGGTG 

-3783 GAGTTAATGA TTAATAGTAT GAAGTTTCTA TGTGAGATGA TGAAAATGTT CTGGAAAA 

-3723 AAATATAGTG GTGAGGATGT AGAATATTGT GAATATAATT AACGGCATTT AATTGTAC 

-3663 TTAACATGAT TAATGTGGCA TATTTTATCT TATGTATTTG ACTACATCCA AGAAACAC

-3603 GQAGAGGGAA AGCCCACCAT GTAAAATACA CCCACCCTAA TCAGATAGTC CTCATTGT 

-3543 CCAGGTACAG GCCCCTCATG ACCTGCACAG GAATAACTAA GGATTTAAGG ACATGAGG 
-3483 TCCCaVGCCAA CTCCAGGTGC ACAACATAAA TGTATCTGCA AACAGACTGa GAGTAAAG
-3423 GGGGGCACAA ACCTCaGCAC TGCCAGGACA CACaCCCTTC TOGTGGATTC TCACTTXA 
-3363 TCACCCGGCC CACTGTCCaG ATCTTGTTGT GGGATTGGGA CAAGGGAGGT CATAAAGC 
-3303 GTCCCCAGGG CACTCTGTGT GAGCACACGA GACCTCCCCA CCCCCCCACC GTTAGGTC 
-3243 CACACATAGA TCTGACCATT AGGCATTGTG AGGAGGACTC TAGCGCGGGC TCAGGGAT

-3183 CACCAGAGAA TCAGGTACAG AGAGGAAGAC GGGGCTCGRG GAGCTGATGG ATGACACA 

-3123 GO^GGGTTCC TGCAGTCCAC AGGTCCAGCT CACCCTGGTG TAGGTGCCCC ATCCCCCT 

-3063 TCCAGGCATC CCTGACACAG CTCCCTCCCG GAGCCTCCTC CCAGGTGACA CATCAGGG 

-3003 CCTCACTCAA GCTGTCCAGA QAGGGCAGCA CCTTGGACAG CGCCCACCCC ACTTCACT 

-2943 TCCTCCCTCA CAGGGCTCAG GGCTCAGGGC TCAAGTCTCA GAACAAATGG CAGAGGCC

-2883 TSAGCCCAGA GATGGTGACA GGGCAATGAT CCAGGGGCAG CTGCCTGAAA CX3GGAGCA 

-2823 T6AAGCCACA GATGG6AGAA QATGGTPCAG GftAGAAAAAT CCAGGAATGG GCAGGAGA 

-2763 AGAGGAGGAC ACAGGCTCTG TGGGGCTGCA GCCCAGGATG GGACTAAGTG TGAAGACA

-2703 TCAGCAGCSTG AGGCCAGGTC CCATGAACAG A6AAGCAGCT CCCACCTCCC CTGATGCA 

-2643 GACACACAGA GTGTGTGGTG CTGTGCCCCC AGAGTCGGGC TCTCCTGTTC TGGTCCCC

-2583 GGAGTGA(5AA GTCAGGTTCA CTTGTCCCTG CTCCTCTCT6 CTACCCCAAC ATTCACCT 

-2523 TCCTCATGCC CCTCTCTCTC AAATATGATT TGGATCTATG TCCCCGCCCA AATCTCAT

-2463 CAAATTCTAA ACCCCAATGT TGGAGGTGGG GCCTTGTCAG AAGTGATTGG ATAATGCG 

-2403 TGGATTTTCT GCTTTCATGC TGTTTCTGTG ATAGAGATCT CACATGATCT GGTTGTTT

-2343 AAGTGTGTAG CACCTCTCCC CTCTCTCTCT CTCTCTCTTA CTCATGCTCT GCCATGTA 

-2283 ACGTTCCTGT TTCCCCTTCA CCGTCCAGAA TGATTGTAAG TTTTCTGAGG CCTCCCCA 

-2223 AGCAGAAGCC ACTATGCTTC CTGTACAACT GCAGAATGAT GAGCGAATTA AACCTCTT 

-2163 CTTTATAAAT TACCCAGTCT CAGGTATTTC TTTATAGCAA TGCGAGGACA QACTAATA 
-2103 ATCTTCTACT CCCAGATCCC CGCACACGCT TAGCCCCAGA CATCACTGCC CCTGGGAG

-2043 TGCACAGCGC AGCCTCCTGC CGACAAAAGC AAAGTCACAA AAGGTGACAA AAATCTGC 

-1983 TTGGGGACAT CTGATTGTGA AAGAGGGAGG ACAGTACACT TGTAGCCACA GAGACTGG 

-1923 CTCACCGAGC TGAAACCTGG TAGCACTTTG feCATAACATG TGCATGACCC GTGTTCAA 

-1863 TCTAGAGATC AGTGTTGAGT AAAACAGCCT GGTCTGGGGC CGCTGCTGTC CCCACTTC 

-1803 TCCTGTCCAC CAGAGGGCGG CAGAGTTCCT CCCACCCTGG AGCCTCCCCA GGGGCTGC 

-1743 ACCTCCCTCA GCCGGGCCCA CAGCCCAGCA GGGTCCACCC TCACCCGGGT CACCTCGG 

-1683 CACGTCCTCC TCGCCCTCCG AGCTCCTCAC ACGGACTCTG TCAGCTCCTC CCTGCAGC 

-1623 ATCGGCCGCC CACCTGAGGC TTGTCGGCCG CCCACTTGAG GCCTGTCGGC TGCCCTCT 

-1563 AGGCAGCTCC TGTCCCCTAC ACCCCCTCCT TCCCCGGGCT CAGCTGAAAG GGCGTCTC 

-1503 AGGGCAGCTC CCTGTGATCT CCAGGACAGC TCAGTCTCTC ACAGGCTCCG ACGCCCCC 

-1443 TGCTGTCACC TCACAGCCCT GTCATTACCA TTAACTCCTC AGTCCCATGA AGTTCACT 

-1383 GCGCCTGTCT CCCGGTTACA GGAAAACTCT GTGACAGGGA CCACGTCTGT CCTGCTCT 

-1323 GTGGAATCCC AGGGCCCAGC CCAGTGCCTG ACACGGAACA GATGCTCCAT AAATACTG

-1263 TAAATGTGTG GGAGATCTCT AAAAAGAAGC ATATCACCTC CGTGTGGCCC CCAGCAGT 

-1203 GAGTCTGTTC CATGTGGACA CAGGGGCACT GGCACCAGCA TGGGAGGAGG CCAGCAAG 

-1143 CCCX3CGGCTG CCCCAGGAAT GAGGCCTCAA CCCCCAGAGC TTCAG2JIGGG AGGACAGA 

-1083 CCTGCAGGGA ATAGATCCTC CGGCCTGACC CTGCAGCCTA ATCCAGAGTT CAGGGTCA 

-1023 TCACACCACG TCGACCCTGG TCAGCATCCC TAGGGCAGTT CCAGACAAGG CCGGAGGT 

-963 CCTCTTGCCC TCCAGGGGGT GACATTGCAC ACAGACATCA CTCAGGAAAC GGATTCCC 

-903 GGACAGGAAC CTGGCTTTGC TAAGGAAGTG GAGGTGGAGC CTGGTTTCCA TCCCTTGC 

-843 CAACAGACCC TTCTGATCTC TCCCACATAC CTGCTCTGTT CCTTTCTGGG TCCTATGA 

-783 ACCCTGTTCT GCCAGGGGTC CCTGTGCAAC TCCAGACTCC CTCCTGGTAC CACCATGG 
-723 AAGGTGGGGT GATCACAG<a awma«5CCT CGCAG^

-663 GGGAGAACAT GGACAGGCCC TGAGCCGCAQ CTCAGCCAAC AGACACGGAG AGGGAGG6 
-603 CCCCTGGAGC CTTCCCCAAG GACaOCAGAG CCCAGAGTCA CCCACCTCCC TCCACCAC 
-543 TCCTCTCTTT CCAGC3ACACA CAAGACACCT CCCCCTCCAC ATGCAGGATC TGGGGACT 
-483 TGA<aCCTCT GGGCCTGGCT CTCCATCCCT GGGTCAGl^ CC«^ 
-423 ACAGAGGGCT GGTCCCTCCC CAGCCACCAC CCAGTGAfiCC TTTTTCTAGC CCCCAGAG 
-363 ACCTCTCTCA CCTTCCTGTT GGGCATCATC CCACCTTCCC AGAGCCCTGG AGAGCATG 
-303 GAGACCCGGG ACCCTGCTGG GTTTCTCTGT CACAAAGGAA AATAATCCCC CTGGTGTG 
-243 AGACCCAAGG ACAGAACACa GOMSAGGTCA GCACTO^ 
-183 <«»TGGGGGT CttTCCACCT TGCCGAAAAG ATTTGTCTGA GGAACT^ 
-123 AAAAAGAGGA GGGACAAAAG AGGCAGftAAT GAGAGGGGAG GGGACAGAGG A<aCCT^ 
-63 AAAGACCACA CCCATCACCC ACGTGATGCT GAGAACTACT CCTGCCCTAG GAAGAGAC
-3 AGGGCAGAGG GAGGAAGGAC AGCAGACXa^G ACAGTCACAG CAGCCTTCAC AAAACGTT
57 TGGAACTCAA GCTCTTCTCC ACAGAGGAGG ACAGAGCAGA CAGCAGAGAC CATGGAGT 
117 CCCTCGGCCC CTCCCCACAG ATGGTGCATC CCCTGGCAGA GGCTCCTGCT CACAGGTG 
177 GGGAGGACAA CCTGGGAGAG GGTGGGAGGA GGQAGCTGGG GTCT^ 
237 CTGTGAGACG GACAGAGOGC TCCTGTTGGA GCCT6AATAG GGAAG^ 
297 GACAGQAGTC AO^CCAGAAA AATOVAATTG AACTGGAATT GGAAAGGGGC AGG^ 
357 CAAGAGTTCT ATTTTCCTAG TTAATTGTCA CTGGCCACTA CGTTTTTAAA AATCATAA 
417 ACTGCATCAG ATGACACTTT AAATAAAAAC ATAACCAGGG CATGAAACAC TGTCCTCA
477 COCCTACCGC GGACATTGGA AAATAA6CCC CAGGCTGTGG AGGGCCCTGG GAACCCTC 
537 GAACTCATCC ACAGGAATCT GCaGCCTGTC CCAGGCACTG GGGTGCAACC AAGATC 

Fig.6(n/W 